About this Manual


This manual provides a general overview of designing Field
Programmable Gate Arrays (FPGAs) with Hardware Description
Languages (HDLs). It includes design hints for the novice HDL user, 
as well as for the experienced user who is designing FPGAs for the 
first time. 

The design examples in this manual were created with Verilog and 
VHSIC Hardware Description Language (VHDL); compiled with 
various synthesis tools; and targeted for XC4000, Spartan, and 
XC5200 devices. Xilinx equally endorses both Verilog and VHDL. 
VHDL may be more difficult to learn than Verilog and usually 
requires more explanation. 

This manual does not address certain topics that are important when 
creating HDL designs, such as the design environment; verification 
techniques; constraining in the synthesis tool; test considerations;
and system verification. Refer to your synthesis tool's reference 
manuals and design methodology notes for additional information. 

Before using this manual, you should be familiar with the operations 
that are common to all Xilinx software tools. These operations are 
covered in the Quick Start Guide.

Additional Resources 

For additional information, go to http://support.xilinx.com. The
following table lists some of the resources you can access from this
page. You can also directly access some of these resources using the
provided URLs.


Resource            Description/URL

Tutorial            Tutorials covering Xilinx design flows, from design
                    entry to verification and debugging
                    http://support.xilinx.com/support/techsup/tutorials/index.htm

Answers Database    Current listing of solution records for the Xilinx
                    software tools
                    Search this database using the search function at
                    http://support.xilinx.com/support/searchtd.htm

Application Notes   Descriptions of device-specific design techniques and
                    approaches
                    http://www.support.xilinx.com/apps/appsweb.htm

Data Book           Pages from The Programmable Logic Data Book, which
                    describe device-specific information on Xilinx device
                    characteristics, including readback, boundary scan,
                    configuration, length count, and debugging
                    http://www.support.xilinx.com/partinfo/databook.htm

Xcell Journals      Quarterly journals for Xilinx programmable logic users
                    http://www.support.xilinx.com/xcell/xcell.htm

Tech Tips           Latest news, design tips, and patch information on the
                    Xilinx design environment
                    http://www.support.xilinx.com/support/techsup/journals/index.htm


Manual Contents 


• Chapter 1, "Getting Started," provides a general overview of
designing Field Programmable Gate Arrays (FPGAs) with HDLs.
This chapter also includes installation requirements and
instructions.

• Chapter 2, "HDL Coding Hints," includes HDL coding hints and
design examples to help you develop an efficient coding style.

• Chapter 3, "Understanding High-Density Design Flow," provides
synthesis and Xilinx implementation techniques to increase
design performance and utilization.

• Chapter 4, "Designing FPGAs with HDL," includes coding
techniques to help you improve synthesis results.

• Chapter 5, "Simulating Your Design," describes simulation
methods for verifying the function and timing of your designs.

• Appendix A, "Accelerate FPGA Macros with One-Hot
Approach," reprints an article describing one-hot encoding in
detail.

• Appendix B, "Report Files," includes area and timing report files
from various synthesis vendors.

Conventions


This manual uses the following typographical and online document 
conventions. An example illustrates each typographical convention. 

Typographical 

The following conventions are used for all documents. 

• Courier font indicates messages, prompts, and program files 
that the system displays. 

speed grade: -100 

• Courier bold indicates literal commands that you enter in a
syntactical statement. However, braces "{ }" in Courier bold are
not literal and square brackets "[ ]" in Courier bold are literal
only in the case of bus specifications, such as bus [7:0].

rpt_del_net=

Courier bold also indicates commands that you select from a
menu.

File → Open

• Italic font denotes the following items. 

• Variables in a syntax statement for which you must supply 
values 

edif2ngd design_name

• References to other manuals 

See the Development System Reference Guide for more
information.

• Emphasis in text

If a wire is drawn so that it overlaps the pin of a symbol, the
two nets are not connected.

• Square brackets "[ ]" indicate an optional entry or parameter.
However, in bus specifications, such as bus [7:0], they are
required.

edif2ngd [option_name] design_name

• Braces "{ }" enclose a list of items from which you must choose
one or more.

lowpwr ={on|off}

• A vertical bar "|" separates items in a list of choices.

lowpwr ={on|off}

• A vertical ellipsis indicates repetitive material that has been
omitted.

IOB #1: Name = QOUT'
IOB #2: Name = CLKIN'
.
.
.

• A horizontal ellipsis "..." indicates that an item can be repeated
one or more times.

allow block block_name loc1 loc2 ... locn;


Online Document 

The following conventions are used for online documents. 

• Red-underlined text indicates an interbook link, which is a cross-
reference to another book. Click the red-underlined text to open
the specified cross-reference.

• Blue-underlined text indicates an intrabook link, which is a cross-
reference within a book. Click the blue-underlined text to open
the specified cross-reference.


Chapter 1


Getting Started 


This chapter provides a general overview of designing Field
Programmable Gate Arrays (FPGAs) with HDLs and also includes
installation requirements and instructions. It includes the following. 

• "Introduction" 

• "Advantages of Using HDLs to Design FPGAs" 

• "Designing FPGAs with HDLs"

• "Installing Design Examples and Tactical Software"

• "Technical Support" 

Introduction 

Hardware Description Languages (HDLs) are used to describe the 
behavior and structure of system and circuit designs. This chapter 
includes a general overview of designing FPGAs with HDLs. System 
requirements and installation instructions are also provided. 

To learn more about designing FPGAs with HDLs, Xilinx recommends
that you enroll in the appropriate training classes offered by
Xilinx and by the vendors of synthesis software. An understanding of 
FPGA architecture allows you to create HDL code that effectively 
uses FPGA system features. 

Before you start to create your FPGA designs, refer to the current 
version of the Quick Start Guide for Xilinx Alliance Series for a 
description of the design flow; installation information; and general 
information on the Xilinx tools. 

For the latest information on Xilinx parts and software, visit the 
Xilinx Web site at http://www.xilinx.com. On the Xilinx home page, 
click on Service and Support, and use the Customer Service and
Support page to get answers to your technical questions. You can also
use the File Download option to download the latest software 
patches, tutorials, design files, and documentation. 

Advantages of Using HDLs to Design FPGAs 

Using HDLs to design high-density FPGAs is advantageous for the 
following reasons. 

• Top-Down Approach for Large Projects—HDLs are used to 
create complex designs. The top-down approach to system 
design supported by HDLs is advantageous for large projects 
that require many designers working together. After the overall 
design plan is determined, designers can work independently on 
separate sections of the code. 

• Functional Simulation Early in the Design Flow—You can 
verify the functionality of your design early in the design flow by 
simulating the HDL description. Testing your design decisions 
before the design is implemented at the RTL or gate level allows 
you to make any necessary changes early in the design process. 

• Synthesis of HDL Code to Gates—You can synthesize your 
hardware description to a design implemented with gates. This 
step decreases design time by eliminating the traditional 
gate-level bottleneck. Synthesis to gates also reduces the number 
of errors that can occur during a manual translation of a hardware
description to a schematic design. Additionally, you can
apply the techniques used by the synthesis tool (such as machine
encoding styles or automatic I/O insertion) during the optimization
of your design to the original HDL code, resulting in greater
efficiency.

• Early Testing of Various Design Implementations—HDLs allow 
you to test different implementations of your design early in the 
design flow. You can then use the synthesis tool to perform the 
logic synthesis and optimization into gates. Additionally, Xilinx 
FPGAs allow you to implement your design at your computer. 
Since the synthesis time is short, you have more time to explore 
different architectural possibilities at the Register Transfer Level 
(RTL). You can reprogram Xilinx FPGAs to test several
implementations of your design.

Designing FPGAs with HDLs

If you are more familiar with schematic design entry, you may find it
difficult at first to create HDL designs. You must make the transition
from graphical concepts, such as block diagrams, state machines,
flow diagrams, and truth tables, to abstract representations of design
components. You can ease this transition by not losing sight of your
overall design plan as you code in HDL. To effectively use an HDL,
you must understand the syntax of the language; the synthesis and 
simulator software; the architecture of your target device; and the 
implementation tools. This section gives you some design hints to 
help you create FPGAs with HDLs. 

Using Verilog 

Verilog is popular for synthesis designs because it is less verbose
than traditional VHDL, and it is standardized as IEEE-STD-1364-95.
It was not originally intended as an input to synthesis, and many
Verilog constructs are not supported by synthesis software. The
Verilog examples in this manual were tested and synthesized with
current, commonly-used FPGA synthesis software. The coding
strategies presented in the remaining chapters of this manual can help you
create HDL descriptions that can be synthesized.

Using VHDL 

VHSIC Hardware Description Language (VHDL) is a hardware
description language for designing Integrated Circuits (ICs). It was 
not originally intended as an input to synthesis, and many VHDL 
constructs are not supported by synthesis software. However, the 
high level of abstraction of VHDL makes it easy to describe the 
system-level components and test benches that are not synthesized. 

In addition, the various synthesis tools use different subsets of the 
VHDL language. The examples in this manual will work with most 
commonly used FPGA synthesis software. The coding strategies 
presented in the remaining chapters of this manual can help you 
create HDL descriptions that can be synthesized. 

Comparing ASICs and FPGAs 

Methods used to design ASICs do not always apply to FPGA designs. 
ASICs have more gate and routing resources than FPGAs. Because
ASICs have a large number of available resources, you can easily
create inefficient code that results in a large number of gates. When 
designing FPGAs, you must create efficient code. 

Using Synthesis Tools 

Most of the commonly-used FPGA synthesis tools have special
optimization algorithms for Xilinx FPGAs. Constraints and compiling
options perform differently depending on the target device. There are
some commands and constraints that do not apply to FPGAs and, if
used, may adversely impact your results. You should understand
how your synthesis tool processes designs before creating FPGA
designs. Most synthesis vendors include information in their
manuals specifically for Xilinx FPGAs.

Using FPGA System Features 

You can improve device performance and area utilization by creating 
HDL code that uses FPGA system features, such as global reset, wide 
I/O decoders, and memory. FPGA system features are described in 
this manual. 

Designing Hierarchy 

Current HDL design methods are specifically written for ASIC 
designs. You can use some of these ASIC design methods when 
designing FPGAs; however, certain techniques may unnecessarily 
increase the number of gates or CLB levels. 

Design hierarchy is important in the implementation of an FPGA and 
also during incremental or interactive changes. Some synthesizers 
maintain the hierarchical boundaries unless you group modules 
together. Modules should have registered outputs so their boundaries 
are not an impediment to optimization. Otherwise, modules should 
be as large as possible within the limitations of your synthesis tool. 
The "5,000 gates per module" rule is no longer valid, and can
interfere with optimization. Check with your synthesis vendor for the
current recommendations for preferred module size. As a last resort, 
use the grouping commands of your synthesizer, if available. The size 
and content of the modules influence synthesis results and design 
implementation. This manual describes how to create effective design
hierarchy.

Specifying Speed Requirements

To meet timing requirements, you should understand how to set 
timing constraints in both the synthesis and placement/routing tools. 

Installing Design Examples and Tactical Software 

The information in this manual supplements information in your
synthesis and HDL simulator manuals. Before you start designing 
Xilinx FPGAs, read the Xilinx-specific information in your HDL 
manuals. Also, read and follow the instructions in the latest version 
of the Quick Start Guide for Xilinx Alliance Series, as well as the current 
version of the Alliance Series Install and Release Document. 

This manual includes numerous HDL design examples created with
VHDL and Verilog. VHDL is more comprehensive than Verilog, and
you may need to spend more time learning how to apply VHDL
constructs to synthesis.

Software Requirements 

To synthesize, simulate, and implement the design examples in this 
manual, you should have the current versions of your synthesis and 
simulation software, as well as the Alliance Series 2.1 or later version 
of the Xilinx Development System installed on your system. 

Memory Requirements 

The values provided in the following table are for typical designs, 
and include loading the operating system. Additional memory may 
be required for certain "boundary-case" or unusual designs, as well 
as for the concurrent operation of other applications (for example, 
synthesis or HDL simulation). Xilinx recommends compiling
XC4000EX/XL designs on the Ultra Sparc, HP715, or equivalent
workstations. Although 64 MB of RAM and 64 MB of swap space are
required to compile XC4000EX designs, Xilinx recommends that you
use 128 MB of RAM and 128 MB of swap space for more efficient
processing of your XC4000EX designs.

Table 1-1 Memory Requirements for Workstations and PCs

Xilinx Device                   RAM       Swap Space

XC3000A/L                       64 MB     64 MB - 128 MB
XC3100A/L
XC4000E/L
XC4028EX through XC4036EX
XC4002XL through XC4028XL
XCS (Spartan)
XC5200
XC9500 (small devices)

XC4036XL through XC4062XL       128 MB    128 MB - 256 MB
XC9500 (large devices)

XC4085XL                        256 MB    256 MB - 512 MB
XC40125XV


Disk Space Requirements 

Before you install the programs and files, verify that your system 
meets the requirements listed in the following table for the applicable 
options. The disk space requirements listed are approximations and 
may not exactly match the actual numbers. To significantly reduce 
the amount of disk space needed, install only the components and 
documentation that you will actually use. In the following table, the 
Data column represents files that are common to all three workstation 
platforms. For example, for a Solaris machine, you need ~110 (12
plus 98) MB of disk space. 

Note: Refer to the Alliance Series Install and Release Document for more 
information on disk space requirements. 


Table 1-2 Disk Space Requirements

Software Component              Data            Sol        HP

Xilinx Core Technology          ~12 MB          ~98 MB     ~108 MB

Xilinx Device Data Files        ~26 MB
(All devices) (a)

Documentation:                  ~30 MB total
Online Help
Documentation Browser           ~17 MB          ~10 MB     ~10 MB

Xilinx Tutorial Files           ~1 MB

Xilinx Userware                 ~4 MB

a. The memory requirements specified are for the installation of all Xilinx devices. You can significantly
reduce the amount of disk space required by installing only the files for the devices you want to target.


Xilinx Internet Site 

To download the programs and files from the Xilinx Internet Site, you
must meet the disk requirements listed in the following table. 

Table 1-3 Internet Files

Directory/Location      Description                 Compressed File          Directory Size

M1_VHDL_source (a)      All VHDL source code        m1_vhdl_src.tar.Z        271 KB
                        only (no scripts,           (size: 60 KB)
                        compilation, or             or
                        implementation files)       m1_vhdl_src.zip
                                                    (size: 68 KB)

M1_Verilog_source (a)   All Verilog source code     m1_verilog_src.tar.Z     256 KB
                        only (no scripts,           (size: 57 KB)
                        compilation, or             or
                        implementation files)       m1_verilog_src.zip
                                                    (size: 64 KB)

M1_HDL_source (a)       All VHDL and Verilog        m1_hdl_src.tar.Z         497 KB
                        source code only (no        (size: 110 KB)
                        scripts, compilation, or    or
                        implementation files)       m1_hdl_src.zip
                                                    (size: 129 KB)

a. These files are located at ftp://ftp.xilinx.com/pub/applications/3rdparty


Retrieving Tactical Software and Design Examples 

You can retrieve the HDL design examples from the Xilinx Internet 
Site. If you need assistance retrieving the files, use the information
listed in the "Technical Support" section of this chapter to contact the
Xilinx Hotline.

You must install the retrieved files on the same system as the Xilinx
software and the synthesis and simulation tools. However, do not
install the files into the directory with the current release of the
software since they may get overwritten during the installation of the
next version of the software.

From Xilinx Internet FTP Site 

You can retrieve the programs and files from the Xilinx Internet FTP
(File Transfer Protocol) site. Alternatively, if you are not familiar with
FTP, you can retrieve the files by going to the Xilinx Web site
(http://www.xilinx.com), clicking on Service and Support, and using the File
Download option. To access the Xilinx FTP Site, you must either have
an internet-capable FTP utility available on your machine or a Web
browser that has FTP. To use FTP, your machine must be connected to
the Internet and you must have permission to use FTP on remote
sites. If you need more information on this procedure, contact your
system administrator.

To retrieve the programs and files from the Xilinx FTP site, use the
following procedure.

1. Go to the directory on your local machine where you want to
download the files, as follows.

cd directory

2. Invoke the FTP utility or your Web browser that provides FTP.

3. Connect to the Xilinx FTP site, ftp.xilinx.com, as follows.

ftp> open ftp.xilinx.com

or

Enter the following URL.

ftp://ftp.xilinx.com

4. Log into a guest account if the FTP utility or Web browser does
not perform this automatically. This account gives you download
privileges.

Name (machine:user-name): anonymous
Guest login ok, send your complete e-mail
address as the password.
Password: your_email_address

5. Go to the following directory.

ftp> cd pub/applications/3rdparty

6. If you are using an FTP utility, make sure you are in binary mode.

ftp> bin

7. Retrieve the appropriate design files as follows.

ftp> get design_files.tar.Z

or

ftp> get design_files.zip

or

Select the appropriate file and select a destination directory on
your local machine.

8. Extract the files as described in the next section.

Extracting the Files 

You must install the retrieved files on the same system as the current 
release of the Xilinx software and the synthesis and simulation tools. 
However, do not install the files in the directory with the current
software because they may get overwritten during the installation of the
next version of the software. The files are stored in the UNIX
standard tar and compress form, as well as in the PC standard zip form.
To extract the files, use one of the following procedures. 

Note: If the following procedures do not work on your system, 
consult your system administrator for help on extracting the files. 

Extracting .tar.Z File in UNIX 

1. Go to the directory where you downloaded the files.

cd downloaded_files

2. Uncompress the files.

uncompress design.tar.Z

3. Extract the files.

tar xvf design.tar

Extracting .zip File in UNIX

1. Go to the directory where you downloaded the files.

cd downloaded_files

2. Uncompress the files.

unzip design.zip

Extracting .zip File in MS-DOS

1. Go to the directory where you downloaded the files.

cd downloaded_files

2. Uncompress the files.

pkunzip -d design.zip

Directory Tree Structure 

After you have completed the installation you should have the
following directory tree structure.

5k_preset
    /VHDL
    /Verilog
/Async_RAM_as_latch
    /VHDL
    /Verilog
/Barrel_SR
    /VHDL
        /Barrel
        /Barrel_Org
    /Verilog
        /Barrel
        /Barrel_Org
/Bidir_LogiBLOX
    /VHDL
    /Verilog
/Bidir_infer
    /VHDL
    /Verilog
/Bidir_instantiate
    /VHDL
    /Verilog
/Bnd_scan_4k
    /VHDL
    /Verilog
/Bnd_scan_5k
    /VHDL
    /Verilog
/Case_vs_if
    /VHDL
        /Case_ex
        /If_ex
    /Verilog
        /Case_ex
        /If_ex
/Clock_enable
    /VHDL
    /Verilog
/Clock_mux
    /VHDL
    /Verilog
/Constants
    /VHDL
    /Verilog
        /Parameter1
        /Parameter2
/D_latch
    /VHDL
    /Verilog
/D_register
    /VHDL
    /Verilog
/FF_example
    /VHDL
    /Verilog
/GR_5K
    /VHDL
        /Active_low_GR
        /No_GR
        /Use_GR
    /Verilog
        /Active_low_GR
        /No_GR
        /Use_GR
/GSR
    /VHDL
        /Active_low_GSR
        /No_GSR
        /Use_GSR
    /Verilog
        /Active_low_GSR
        /No_GSR
        /Use_GSR
/Gate_clock
    /VHDL
        /Gate_clock
        /Gate_clock2
    /Verilog
        /Gate_clock
        /Gate_clock2
/IO_Decoder
    /VHDL
    /Verilog
/LogiBLOX_DP_RAM
    /VHDL
    /Verilog
/LogiBLOX_SR
    /VHDL
    /Verilog
/Mux_vs_3state
    /VHDL
        /Mux_gate
        /Mux_gate16
        /Mux_tbuf
        /Mux_tbuf16
    /Verilog
        /Mux_gate
        /Mux_gate16
        /Mux_tbuf
        /Mux_tbuf16
/Nested_if
    /VHDL
        /If_case
        /Nested_if
    /Verilog
        /If_case
        /Nested_if
/OMUX_example
    /VHDL
    /Verilog
/RAM_primitive
    /VHDL
    /Verilog
/ROM_RTL
    /VHDL
    /Verilog
/Res_sharing
    /VHDL
        /Res_no_share
        /Res_sharing
    /Verilog
        /Res_no_share
        /Res_sharing
/Set_and_Reset
    /VHDL
    /Verilog
/Sig_vs_Var
    /VHDL
        /Xor_Sig
        /Xor_Var
/State_Machine
    /VHDL
        /Binary
        /Enum
        /One_Hot
    /Verilog
        /Binary
        /Enum
        /One_Hot
/Unbonded_IO
    /VHDL
    /Verilog

Technical Support 

You can contact Xilinx for additional information and assistance in
the following ways. 

Xilinx World Wide Web Site 

Enter http://www.xilinx.com. Click on the Service and Support 
option on the Xilinx Home Page. Use the Customer Service and 
Support page to get answers to your technical questions. You can use 
the Answers Search option to search the Answers database, file
download area, application notes, XCELL journals, data sheets, and
expert journals.

Technical and Applications Support Hotlines 

The telephone hotlines give you direct access to Xilinx Application
Engineers worldwide. You can also e-mail or fax your technical
questions to the same locations.

Table 1-4 Technical Support

Location               Telephone         Electronic Mail        Facsimile (Fax)

North America          1-800-255-7778    hotline@xilinx.com     1-408-879-4442
Japan                  81-3-3297-9163    jhotline@xilinx.com    81-3-3297-0067
France                 33-1-3463-0100    frhelp@xilinx.com      33-1-3463-0959
Germany                49-89-9915-4930   dlhelp@xilinx.com      49-89-904-4748
United Kingdom         44-1932-820821    ukhelp@xilinx.com      44-1932-828522
Corporate Switchboard  1-408-559-7778




Note: When e-mailing or faxing inquiries, provide your complete 
name, company name, and phone number. Also, provide a complete 
problem description including your design entry software and design 
stage. 

Xilinx FTP Site 

ftp://ftp.xilinx.com 

The FTP site provides online access to automated tutorials, design
examples, online documents, utilities, and published patches. 

XDOCS E-mail Server 

xdocs@xilinx.com

Include the word "help" in the subject header. This e-mail service 
provides access to the Customer Service and Support page from the 
Xilinx World Wide Web Site. On the Xilinx home page, click on 
Service and Support, and use the Customer Service and Support page 
to get answers to your technical questions.


Chapter 2


HDL Coding Hints


This chapter contains HDL coding hints and design examples to help 
you develop an efficient coding style. It includes the following topics. 

• "Comparing Synthesis and Simulation Results" 

• "Selecting HDL Formatting Styles" 

• "Using Schematic Design Hints with HDL Designs" 

HDLs contain many complex constructs that are difficult to
understand at first. Also, the methods and examples included in HDL
manuals do not always apply to the design of FPGAs. If you
currently use HDLs to design ASICs, your established coding style
may unnecessarily increase the number of gates or CLB levels in
FPGA designs.

HDL synthesis tools implement logic based on the coding style of 
your design. To learn how to efficiently code with HDLs, you can 
attend training classes, read reference and methodology notes, and 
refer to synthesis guidelines and templates available from Xilinx and 
the synthesis vendors. When coding your designs, remember that 
HDLs are mainly hardware description languages. You should try to 
find a balance between the quality of the end hardware results and 
the speed of simulation. 

The coding hints and examples included in this chapter are not 
intended to teach you every aspect of VHDL or Verilog, but they
should help you develop an efficient coding style. 

Comparing Synthesis and Simulation Results 

VHDL and Verilog are hardware description and simulation 
languages that were not originally intended as input to synthesis. 
Therefore, many hardware description and simulation constructs are
not supported by synthesis tools. In addition, the various synthesis
tools use different subsets of VHDL and Verilog. VHDL and Verilog 
semantics are well defined for design simulation. The synthesis tools 
must adhere to these semantics to ensure that designs simulate the 
same way before and after synthesis. Follow the guidelines presented 
below to create code that simulates the same way before and after 
synthesis. 

Omit the Wait for XX ns Statement 

Do not use the Wait for XX ns statement in your code. XX specifies the 
number of nanoseconds that must pass before a condition is 
executed. This statement does not synthesize to a component. In 
designs that include this statement, the functionality of the simulated 
design does not match the functionality of the synthesized design. 
VHDL and Verilog examples of the Wait for XX ns statement are as 
follows. 

• VHDL 

wait for XX ns; 

• Verilog 
#XX;
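
Where a design genuinely needs to wait for a period of time, one common alternative is to count clock cycles in a register, which does synthesize. The following Verilog module is a minimal sketch of that idea and is not one of the Xilinx design example files; the module name delay_counter, its ports, and the DELAY_CYCLES parameter are assumptions made for illustration.

```verilog
// Hypothetical example: replace a simulation-only "#XX" wait with a
// clocked counter. DONE pulses high for one clock period,
// DELAY_CYCLES clock edges after START is accepted.
module delay_counter (CLK, RESET, START, DONE);
  input  CLK, RESET, START;
  output DONE;

  // Assumed delay in clock periods; must fit in the 4-bit COUNT register.
  parameter DELAY_CYCLES = 4'd10;

  reg [3:0] COUNT;
  reg       BUSY;
  reg       DONE;

  always @(posedge CLK or posedge RESET)
  begin
    if (RESET) begin
      COUNT <= 4'd0;
      BUSY  <= 1'b0;
      DONE  <= 1'b0;
    end
    else if (START && !BUSY) begin
      // Accept a new request and start counting from zero.
      COUNT <= 4'd0;
      BUSY  <= 1'b1;
      DONE  <= 1'b0;
    end
    else if (BUSY) begin
      if (COUNT == DELAY_CYCLES - 1) begin
        // Delay has elapsed: signal completion and go idle.
        BUSY <= 1'b0;
        DONE <= 1'b1;
      end
      else
        COUNT <= COUNT + 1;
    end
    else
      DONE <= 1'b0;
  end
endmodule
```

Because the delay is expressed in clock cycles rather than nanoseconds, the simulated and synthesized designs behave the same way; the elapsed time then depends only on the clock period you apply.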

Omit the ...After XX ns or Delay Statement 

Do not use the ...After XX ns statement in your VHDL code or the 
Delay assignment in your Verilog code. Examples of these statements 
are as follows. 

• VHDL 

(Q <= 0 after XX ns)

• Verilog

assign #XX Q = 0;

XX specifies the number of nanoseconds that must pass before a 
condition is executed. This statement is usually ignored by the 
synthesis tool. In this case, the functionality of the simulated design 
does not match the functionality of the synthesized design. 
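
Delays still have a place in test benches, which are simulated but not synthesized. The sketch below keeps the synthesizable logic free of delays and confines them to the stimulus. The module names and_gate and and_gate_tb, the instance name UUT, and the 10 ns steps are illustrative assumptions, not part of the Xilinx design examples.

```verilog
`timescale 1ns / 1ns

// Synthesizable logic: the assignment carries no delay.
module and_gate (A, B, Q);
  input  A, B;
  output Q;

  assign Q = A & B;
endmodule

// Test bench only (not synthesized): delays sequence the stimulus.
module and_gate_tb;
  reg  A, B;
  wire Q;

  and_gate UUT (.A(A), .B(B), .Q(Q));

  initial begin
    A = 1'b0; B = 1'b0;
    #10 A = 1'b1;              // drive A high after 10 ns
    #10 B = 1'b1;              // drive B high 10 ns later
    #10 $display("Q = %b", Q); // sample Q after the inputs settle
    $finish;
  end
endmodule
```

Only and_gate would be passed to the synthesis tool; and_gate_tb is read by the simulator alone.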

Use Case and If-Else Statements

You can use If-Else statements, Case statements, or other conditional
code to create state machines or other conditional logic. These
statements implement the functions differently; however, the simulated
designs are identical. The If-Else statement generally specifies
priority-encoded logic and the Case statement generally specifies
balanced behavior. The If-Else statement can, in some cases, result in
a slower circuit overall. The results of these statements vary with the synthesis
tool. Refer to the "Comparing If Statement and Case Statement" 
section of this chapter for more information. 
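
As a hedged illustration of the difference, the two Verilog modules below select one of four inputs. The module names if_priority and case_mux and their port names are assumptions for this sketch. The two modules use different select encodings, so they are not drop-in replacements for each other; they are meant only to show the two coding styles side by side.

```verilog
// If-Else: the conditions are tested in order, so SEL[0] has the
// highest priority and D is selected only when no SEL bit is set.
module if_priority (SEL, A, B, C, D, Q);
  input  [2:0] SEL;
  input        A, B, C, D;
  output       Q;
  reg          Q;

  always @(SEL or A or B or C or D)
  begin
    if (SEL[0])
      Q = A;
    else if (SEL[1])
      Q = B;
    else if (SEL[2])
      Q = C;
    else
      Q = D;
  end
endmodule

// Case: the select values are mutually exclusive, so no input is
// favored over another and the logic can be built as a balanced mux.
module case_mux (SEL, A, B, C, D, Q);
  input  [1:0] SEL;
  input        A, B, C, D;
  output       Q;
  reg          Q;

  always @(SEL or A or B or C or D)
  begin
    case (SEL)
      2'b00:   Q = A;
      2'b01:   Q = B;
      2'b10:   Q = C;
      default: Q = D;
    endcase
  end
endmodule
```

In if_priority, the path from C or D to Q typically passes through more logic than the path from A, which is the priority encoding the text describes. How closely the synthesized result follows either style depends on the synthesis tool.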

Order and Group Arithmetic Functions 

The ordering and grouping of arithmetic functions can influence 
design performance. For example, the following two VHDL
statements are not necessarily equivalent.

ADD <= A1 + A2 + A3 + A4;
ADD <= (A1 + A2) + (A3 + A4);

For Verilog, the following two statements are not necessarily
equivalent.

ADD = A1 + A2 + A3 + A4;
ADD = (A1 + A2) + (A3 + A4);

The first statement cascades three adders in series. The second
statement creates two adders in parallel: A1 + A2 and A3 + A4. In the
second statement, the two additions are evaluated in parallel and the
results are combined with a third adder. RTL simulation results are
the same for both statements; however, the second statement results
in a faster circuit after synthesis (depending on the bit width of the
input signals).

Although the second statement generally results in a faster circuit, in 
some cases, you may want to use the first statement. For example, if 
the A4 signal reaches the adder later than the other signals, the first 
statement produces a faster implementation because the cascaded 
structure creates fewer logic levels for A4. This structure allows A4 to 
catch up to the other signals. In this case, A1 is the fastest signal
followed by A2 and A3. A4 is the slowest signal. 


Synthesis and Simulation Design Guide 




Most synthesis tools can balance or restructure the arithmetic
operator tree if timing constraints require it. However, Xilinx recommends
that you code your design for your selected structure.
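
The two groupings can be written as complete modules for comparison. This is a minimal sketch: the module names add_chain and add_tree, the 8-bit input width, and the use of continuous assignments are assumptions chosen for illustration.

```verilog
// Cascaded form: written as three adders in series. A4 enters at the
// last adder, which suits a late-arriving A4.
module add_chain (A1, A2, A3, A4, ADD);
  input  [7:0] A1, A2, A3, A4;
  output [9:0] ADD;   // 10 bits hold the sum of four 8-bit values

  assign ADD = A1 + A2 + A3 + A4;
endmodule

// Grouped form: A1 + A2 and A3 + A4 are computed in parallel and
// combined by a third adder.
module add_tree (A1, A2, A3, A4, ADD);
  input  [7:0] A1, A2, A3, A4;
  output [9:0] ADD;

  assign ADD = (A1 + A2) + (A3 + A4);
endmodule
```

Both modules produce the same value in RTL simulation. Any difference appears only in the number of logic levels after synthesis, and a synthesis tool may restructure either form when timing constraints require it.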

Omit Initial Values 

Do not assign signals and variables initial values because initial 
values are ignored by most synthesis tools. The functionality of the 
simulated design may not match the functionality of the synthesized 
design. 

For example, do not use initialization statements like the following 
VHDL and Verilog statements.

• VHDL 

variable SUM:INTEGER:=0; 

• Verilog 

wire SUM = 1'b0;
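
One way to obtain a known starting value without an initialization statement is to assign it under a reset, which the synthesis tool does implement. The accumulator below is a sketch under that assumption; the module name sum_reg, its ports, and the active-high asynchronous RESET are illustrative choices, and the preferred reset scheme depends on your target device.

```verilog
// Hypothetical accumulator: SUM starts from a known value because the
// reset branch sets it, not because of a declaration initializer.
module sum_reg (CLK, RESET, D, SUM);
  input        CLK, RESET;
  input  [7:0] D;
  output [7:0] SUM;
  reg    [7:0] SUM;

  always @(posedge CLK or posedge RESET)
  begin
    if (RESET)
      SUM <= 8'h00;     // known start value applied by the reset
    else
      SUM <= SUM + D;   // accumulate D on each clock edge
  end
endmodule
```

Because the start value comes from logic that exists in both the simulated and the synthesized design, the two behave the same way once RESET has been asserted.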

Selecting HDL Formatting Styles 

Because HDL designs are often created by design teams, Xilinx
recommends that you agree on a style for your code at the beginning
of your project. An established coding style allows you to read and
understand code written by your fellow team members. Also,
inefficient coding styles can adversely impact synthesis and simulation,
which can result in slow circuits. Additionally, because portions of
existing HDL designs are often used in new designs, you should 
follow coding standards that are understood by the majority of HDL 
designers. This section of the manual provides a list of suggested 
coding styles that you should establish before you begin your 
designs. 

Selecting a Capitalization Style 

Select a capitalization style for your code. Xilinx recommends using a 
consistent style (lower or upper case) for entity or module names in 
FPGA designs. 

Verilog 

For Verilog, the following style is recommended.

• Use lower case letters for the following.

• Module names 

• Verilog language keywords 

• Use upper case letters for the following. 

• Labels 

• Reg, wire, instance, and instantiated cell names 

Note: Cell names must be upper case to use the UniSim simulation 
library and certain synthesis libraries. Check with your synthesis 
vendor. 

VHDL 

Note: VHDL is case-insensitive. 

For VHDL, use lower case for all language constructs from the
IEEE-STD 1076. Any inputs defined by you should be upper case. For
example, use upper case for the names of signals, instances,
components, architectures, processes, entities, variables,
configurations, libraries, functions, packages, data types, and
sub-types. For the names of standard or vendor packages, the style
used by the vendor or uppercase letters are used, as shown for IEEE in
the following example:

library IEEE;
use IEEE.std_logic_1164.all;
signal SIG: UNSIGNED (5 downto 0);

Using Xilinx Naming Conventions 

Use the Xilinx naming conventions listed in this section for naming 
signals, variables, and instances that are translated into nets, buses, 
and symbols. 

Note: Most synthesis tools convert illegal characters to legal ones. 

• User-defined names can contain A-Z, a-z, $, _, -, <, and >. A "/"
is also valid; however, it is not recommended because it is used as
a hierarchy separator

• Names must contain at least one non-numeric character 

• Names cannot be more than 256 characters long

The following FPGA resource names are reserved and should not be
used to name nets or components.

• Components (Comps), Configurable Logic Blocks (CLBs), Input/
Output Blocks (IOBs), basic elements (bels), clock buffers
(BUFGs), tristate buffers (BUFTs), oscillators (OSC), CCLK, DP,
GND, VCC, and RST

• CLB names such as AA, AB, and R1C2

• Primitive names such as TDO, BSCAN, M0, M1, M2, or STARTUP

• Do not use pin names such as P1 and A4 for component names

• Do not use pad names such as PAD1 for component names 

Refer to the language reference manual for Verilog or VHDL for 
language-specific naming restrictions. Xilinx does not recommend 
using escape sequences for illegal characters. Also, if you plan on
importing schematics into your design, use the most restrictive
character set.

Matching File Names to Entity and Module Names 

The VHDL or Verilog source code file name should match the
designated name of the entity (VHDL) or module (Verilog) specified in
your design file. This is less confusing and generally makes it easier
to create a script file for the compilation of your design. Xilinx also
recommends that if your design contains more than one entity or
module, each should be contained in a separate file with the
appropriate file name. It is also a good idea to use the same name as
your top-level design file for your synthesis script file with either a
.do, .scr, .script, or the appropriate default script file extension for
your synthesis tool.

Naming Identifiers, Types, and Packages 

You can use long (256 characters maximum) identifier names with
underscores and embedded punctuation in your code. Use
meaningful names for signals and variables, such as
CONTROL_REGISTER. Use meaningful names when defining
VHDL types and packages as shown in the following examples.

type LOCATION_TYPE is ...;

package STRING_IO_PKG is

Using Labels

Use labels to group logic. Label all processes, functions, and
procedures as shown in the following examples. Labeling makes it
easier to debug your code.

• VHDL

ASYNC_FF: process (CLK,RST)

• Verilog

always @ (posedge CLK or posedge RST)
begin: ASYNC_FF

Labeling Flow Control Constructs 

You can use optional labels on flow control constructs to make the 
code structure more obvious, as shown in the following VHDL and 
Verilog examples. However, you should note that these labels are not 
translated to gate or register names in your implemented design. 
Flow control constructs can slow down simulations in some Verilog 
simulators. 

• VHDL Example 

-- D_REGISTER.VHD
-- May 1997
-- Changing Latch into a D-Register
library IEEE;
use IEEE.std_logic_1164.all;

entity d_register is
    port (CLK, DATA: in STD_LOGIC;
          Q: out STD_LOGIC);
end d_register;

architecture BEHAV of d_register is
begin

My_D_Reg: process (CLK, DATA)
begin
    if (CLK'event and CLK='1') then
        Q <= DATA;
    end if;
end process; --End My_D_Reg
end BEHAV;

• Verilog Example

/* Changing Latch into a D-Register
 * D_REGISTER.V
 * May 1997 */

module d_register (CLK, DATA, Q);

input CLK;
input DATA;
output Q;

reg Q;

always @ (posedge CLK)
begin: My_D_Reg
    Q <= DATA;
end

endmodule

Using Variables for Constants (VHDL Only) 

Do not use variables for constants in your code. Define constant
numeric values in your code as constants and use them by name. This
coding convention allows you to easily determine if several
occurrences of the same literal value have the same meaning. In some
simulators, using constants allows greater optimization. In the
following code example, the OPCODE values are declared as
constants, and the constant names refer to their function. This
method produces readable code that may be easier to modify.

Using Constants to Specify OPCODE Functions 
(VHDL) 

constant ZERO   : STD_LOGIC_VECTOR (1 downto 0):="00";
constant A_AND_B: STD_LOGIC_VECTOR (1 downto 0):="01";
constant A_OR_B : STD_LOGIC_VECTOR (1 downto 0):="10";
constant ONE    : STD_LOGIC_VECTOR (1 downto 0):="11";

process (OPCODE, A, B)
begin
    if (OPCODE = A_AND_B) then OP_OUT <= A and B;
    elsif (OPCODE = A_OR_B) then OP_OUT <= A or B;
    elsif (OPCODE = ONE) then OP_OUT <= '1';
    else OP_OUT <= '0';
    end if;
end process;

Using Parameters for Constants (Verilog Only) 

You can specify a constant value in Verilog using the parameter special 
data type, as shown in the following examples. The first example 
includes a definition of OPCODE constants as shown in the previous 
VHDL example. The second example shows how to use a parameter 
statement to define module bus widths. 

Using Parameters to Specify OPCODE Functions 
(Verilog) 

parameter ZERO = 2'b00;
parameter A_AND_B = 2'b01;
parameter A_OR_B = 2'b10;
parameter ONE = 2'b11;

// Parameters are referenced by name; the ` prefix is only for `define macros.
always @ (OPCODE or A or B)
begin
    if (OPCODE==ZERO) OP_OUT=1'b0;
    else if(OPCODE==A_AND_B) OP_OUT=A&B;
    else if(OPCODE==A_OR_B) OP_OUT=A|B;
    else OP_OUT=1'b1;
end

Using Parameters to Specify Bus Size (Verilog) 

parameter BUS_SIZE = 8;

output [BUS_SIZE-1:0] OUT;
input [BUS_SIZE-1:0] X, Y;
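
A bus-width parameter is most useful when the same module is
instantiated at different widths. The following is a minimal sketch of
that idea; the module names bus_and and bus_and_top, the AND function,
and the 16-bit override are assumptions made for illustration and do
not come from this manual.

```verilog
// Illustrative sketch only; module names and function are hypothetical.
module bus_and (X, Y, OUT);

parameter BUS_SIZE = 8;            // default width

input  [BUS_SIZE-1:0] X, Y;
output [BUS_SIZE-1:0] OUT;

assign OUT = X & Y;

endmodule

module bus_and_top (P, Q, R);

input  [15:0] P, Q;
output [15:0] R;

// #(16) overrides BUS_SIZE for this instance only.
bus_and #(16) U1 (.X(P), .Y(Q), .OUT(R));

endmodule
```

Changing the width then requires editing one value instead of every
port and signal declaration.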

Using Named and Positional Association 

Use positional association in function and procedure calls, and in
port lists only when you assign all items in the list. Use named
association when you assign only some of the items in the list. Also,
Xilinx suggests that you use named association to prevent incorrect
connections for the ports of instantiated components. Do not combine
positional and named association in the same statement as illustrated
in the following examples.

• VHDL 

Incorrect

CLK_1: BUFGS port map (I=>CLOCK_IN,CLOCK_OUT);

Correct

CLK_1: BUFGS port map (I=>CLOCK_IN,O=>CLOCK_OUT);

• Verilog

Incorrect

BUFGS CLK_1 (.I(CLOCK_IN), CLOCK_OUT);

Correct

BUFGS CLK_1 (.I(CLOCK_IN), .O(CLOCK_OUT));

Managing Your Design 

As part of your coding specifications, you should include rules for 
naming, organizing, and distributing your files. In VHDL designs,
use explicit configurations to control the selection of components and 
architectures that you want to compile, simulate, or synthesize. In 
some synthesis tools, configuration information is ignored. In this 
case, you only need to compile the architecture that you want to 
synthesize. 

Creating Readable Code 

Use the recommendations in this section to create code that is easy to
read.

Indenting Your Code 

Indent blocks of code to align related statements. You should define 
the number of spaces for each indentation level and specify whether 
the Begin statement is placed on a line by itself. In the examples in 
this manual, each level of indentation is four spaces and the Begin 
statement is on a separate line that is not indented from the previous 
line of code. The examples below illustrate the indentation style used 
in this manual.

• VHDL Example

-- D_LATCH.VHD
-- May 1997
library IEEE;
use IEEE.std_logic_1164.all;

entity d_latch is
    port (GATE, DATA: in STD_LOGIC;
          Q: out STD_LOGIC);
end d_latch;

architecture BEHAV of d_latch is
begin

LATCH: process (GATE, DATA)
begin
    if (GATE = '1') then
        Q <= DATA;
    end if;
end process; -- end LATCH
end BEHAV;

• Verilog Example

/* Transparent High Latch
 * D_LATCH.V
 * May 1997 */

module d_latch (GATE, DATA, Q);

input GATE;
input DATA;
output Q;

reg Q;

always @ (GATE or DATA)
begin: LATCH
    if (GATE == 1'b1)
        Q <= DATA;
end // End Latch

endmodule

Using Empty Lines

Use empty lines to separate top-level constructs, designs,
architectures, configurations, processes, subprograms, and packages.

Using Spaces 

Use spaces to make your code easier to read. You can omit or use 
spaces between signal names as shown in the following examples. 

• VHDL Example 

process (RST,CLOCK,LOAD,CE)
process (RST, CLOCK, LOAD, CE)

• Verilog Example 

module test (A,B,C) 
module test (A, B, C) 

Use a space after colons as shown in the following examples. 

• VHDL Example 

signal QOUT: STD_LOGIC_VECTOR (3 downto 0);

CLK_1: BUFGS port map (I=>CLOCK_IN,O=>CLOCK_OUT);

• Verilog Example 

begin: CPU_DATA 

Breaking Long Lines of Code

Break long lines of code at an appropriate point, such as at a comma,
a colon, or a parenthesis to make your code easier to read, as
illustrated in the following code fragments.

• VHDL Example

U1: load_reg port map
    (INX=>A, LOAD=>LD, CLK=>SCLK, OUTX=>B);

• Verilog Example

load_reg U1
    (.INX(A), .LOAD(LD), .CLK(SCLK), .OUTX(B));




Adding Comments 

Add comments to your code to improve readability, reduce
debugging time, and make it easier to maintain your code.

• VHDL Example

-- Read Counter (16-bit)
-- Updated 1-25-98 to add Clock Enable, John Doe
-- Updated 1-28-98 to add Terminal Count, Joe Cool

process (RST, CLOCK, CE)
begin


• Verilog Example

// Read Counter (16-bit)
// Updated 1-25-98 to add Clock Enable, John Doe
// Updated 1-28-98 to add Terminal Count, Joe Cool

always @ (posedge RST or posedge CLOCK)
begin


Using Std_logic Data Type (VHDL only)

The Std_logic (IEEE 1164) type is recommended for hardware
descriptions for the following reasons.

• It has nine different values that represent most of the states found 
in digital circuits. 

• Automatically initialized to an unknown value. This automatic 
initialization is important for HDL designs because it forces you 
to initialize your design to a known state, which is similar to 
what is required in a schematic design. Do not override this 
feature by initializing signals and variables to a known value 
when they are declared because the result may be a gate-level 
circuit that cannot be initialized to a known value. 

• Easy to perform a board-level simulation. For example, if you use
an integer type for ports for one circuit and standard logic for
ports for another circuit, your design can be synthesized;
however, you will need to perform time-consuming type
conversions for a board-level simulation.

The back-annotated netlist from Xilinx implementation is in
Std_logic. If you do not use Std_logic type to drive your top-level
entity in the testbench, you cannot reuse your functional testbench
for timing simulation. Some synthesis tools can create a wrapper for
type conversion between the two top-level entities; however, this is
not recommended by Xilinx.

Declaring Ports 

Xilinx recommends that you use the Std_logic package for all entity
port declarations. This package makes it easier to integrate the
synthesized netlist back into the design hierarchy without requiring
conversion functions for the ports. A VHDL example using the
Std_logic package for port declarations is shown below.

Entity alu is
    port( A   : in STD_LOGIC_VECTOR(3 downto 0);
          B   : in STD_LOGIC_VECTOR(3 downto 0);
          CLK : in STD_LOGIC;
          C   : out STD_LOGIC_VECTOR(3 downto 0) );
end alu;

Since the downto convention for vectors is supported in a
back-annotated netlist, the RTL and synthesized netlists should use the
same convention if you are using the same test bench. This is necessary
because of the loss of directionality when your design is synthesized
to an EDIF or XNF netlist.

Minimizing the Use of Ports Declared as Buffers 

Do not use buffers when a signal is used internally and as an output 
port. In the following VHDL example, signal C is used internally and 
as an output port. 

Entity alu is
    port( A   : in STD_LOGIC_VECTOR(3 downto 0);
          B   : in STD_LOGIC_VECTOR(3 downto 0);
          CLK : in STD_LOGIC;
          C   : buffer STD_LOGIC_VECTOR(3 downto 0) );
end alu;

architecture BEHAVIORAL of alu is
begin
    -- CLK sensitivity list added so the process suspends between edges
    process (CLK) begin
        if (CLK'event and CLK='1') then
            C <= UNSIGNED(A) + UNSIGNED(B) + UNSIGNED(C);
        end if;
    end process;
end BEHAVIORAL;

Because signal C is used both internally and as an output port, every 
level of hierarchy in your design that connects to port C must be 
declared as a buffer. However, buffer types are not commonly used in 
VHDL designs because they can cause problems during synthesis. To 
reduce the amount of buffer coding in hierarchical designs, you can 
insert a dummy signal and declare port C as an output, as shown in 
the following VHDL example. 

Entity alu is
    port( A   : in STD_LOGIC_VECTOR(3 downto 0);
          B   : in STD_LOGIC_VECTOR(3 downto 0);
          CLK : in STD_LOGIC;
          C   : out STD_LOGIC_VECTOR(3 downto 0));
end alu;

architecture BEHAVIORAL of alu is
-- dummy signal
signal C_INT : STD_LOGIC_VECTOR(3 downto 0);
begin
    C <= C_INT;
    -- CLK sensitivity list added so the process suspends between edges
    process (CLK) begin
        if (CLK'event and CLK='1') then
            C_INT <= UNSIGNED(A) + UNSIGNED(B) +
                     UNSIGNED(C_INT);
        end if;
    end process;
end BEHAVIORAL;

Comparing Signals and Variables (VHDL only) 

You can use signals and variables in your designs. Signals are similar
to hardware and are not updated until the end of a process. Variables
are immediately updated and, as a result, can affect the functioning of
your design. Xilinx recommends using signals for hardware
descriptions; however, variables allow quick simulation.

The following VHDL examples show a synthesized design that uses
signals and variables, respectively. These examples are shown
implemented with gates in the "Gate Implementation of XOR_SIG" figure
and the "Gate Implementation of XOR_VAR" figure.

Note: If you assign several values to a signal in one process, only the
final value is used. When you assign a value to a variable, the
assignment takes place immediately. A variable maintains its value
until you specify a new value.

Using Signals (VHDL) 

-- XOR_SIG.VHD
-- May 1997
Library IEEE;
use IEEE.std_logic_1164.all;

entity xor_sig is
    port (A, B, C: in STD_LOGIC;
          X, Y: out STD_LOGIC);
end xor_sig;

architecture SIG_ARCH of xor_sig is
signal D: STD_LOGIC;
begin

SIG:process (A,B,C)
begin
    D <= A; -- ignored !!
    X <= C xor D;
    D <= B; -- overrides !!
    Y <= C xor D;
end process;
end SIG_ARCH;


Figure 2-1 Gate Implementation of XOR_SIG

Using Variables (VHDL)

-- XOR_VAR.VHD
-- May 1997
Library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity xor_var is
    port (A, B, C: in STD_LOGIC;
          X, Y: out STD_LOGIC);
end xor_var;

architecture VAR_ARCH of xor_var is
begin

VAR:process (A,B,C)
variable D: STD_LOGIC;
begin
    D := A;
    X <= C xor D;
    D := B;
    Y <= C xor D;
end process;
end VAR_ARCH;

Figure 2-2 Gate Implementation of XOR_VAR


Using Schematic Design Hints with HDL Designs 

This section describes how to apply schematic entry design strategies
to HDL designs.

Barrel Shifter Design

The schematic version of the barrel shifter design is included in the
"Multiplexers and Barrel Shifters in XC3000/XC3100" application
note (XAPP 026.001) available on the Xilinx web site at
http://www.xilinx.com. In this example, two levels of multiplexers are
used to increase the speed of a 16-bit barrel shifter. This design is
for XC3000 and XC3100 device families; however, it can also be used for
other Xilinx devices.

The following VHDL and Verilog examples show a 16-bit barrel
shifter implemented using sixteen 16-to-1 multiplexers, one for each
output. A 16-to-1 multiplexer is a 20-input function with 16 data
inputs and four select inputs. When targeting an FPGA device based
on 4-input lookup tables (such as XC4000 and XC3000 family of
devices), a 20-input function requires at least five logic blocks.
Therefore, the minimum design size is 80 (16 x 5) logic blocks.

16-bit Barrel Shifter (VHDL)

-- VHDL Model for a 16-bit Barrel Shifter
-- barrel_org.vhd
-- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-- THIS EXAMPLE IS FOR COMPARISON ONLY
-- May 1997
-- USE barrel.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;

entity barrel_org is
    port (S:   in  STD_LOGIC_VECTOR (3 downto 0);
          A_P: in  STD_LOGIC_VECTOR (15 downto 0);
          B_P: out STD_LOGIC_VECTOR (15 downto 0));
end barrel_org;

architecture RTL of barrel_org is
begin

SHIFT: process (S, A_P)
begin
    case S is
        when "0000" =>
            B_P <= A_P;
        when "0001" =>
            B_P(14 downto 0) <= A_P(15 downto 1);
            B_P(15) <= A_P(0);
        when "0010" =>
            B_P(13 downto 0) <= A_P(15 downto 2);
            B_P(15 downto 14) <= A_P(1 downto 0);
        when "0011" =>
            B_P(12 downto 0) <= A_P(15 downto 3);
            B_P(15 downto 13) <= A_P(2 downto 0);
        when "0100" =>
            B_P(11 downto 0) <= A_P(15 downto 4);
            B_P(15 downto 12) <= A_P(3 downto 0);
        when "0101" =>
            B_P(10 downto 0) <= A_P(15 downto 5);
            B_P(15 downto 11) <= A_P(4 downto 0);
        when "0110" =>
            B_P(9 downto 0) <= A_P(15 downto 6);
            B_P(15 downto 10) <= A_P(5 downto 0);
        when "0111" =>
            B_P(8 downto 0) <= A_P(15 downto 7);
            B_P(15 downto 9) <= A_P(6 downto 0);
        when "1000" =>
            B_P(7 downto 0) <= A_P(15 downto 8);
            B_P(15 downto 8) <= A_P(7 downto 0);
        when "1001" =>
            B_P(6 downto 0) <= A_P(15 downto 9);
            B_P(15 downto 7) <= A_P(8 downto 0);
        when "1010" =>
            B_P(5 downto 0) <= A_P(15 downto 10);
            B_P(15 downto 6) <= A_P(9 downto 0);
        when "1011" =>
            B_P(4 downto 0) <= A_P(15 downto 11);
            B_P(15 downto 5) <= A_P(10 downto 0);
        when "1100" =>
            B_P(3 downto 0) <= A_P(15 downto 12);
            B_P(15 downto 4) <= A_P(11 downto 0);
        when "1101" =>
            B_P(2 downto 0) <= A_P(15 downto 13);
            B_P(15 downto 3) <= A_P(12 downto 0);
        when "1110" =>
            B_P(1 downto 0) <= A_P(15 downto 14);
            B_P(15 downto 2) <= A_P(13 downto 0);
        when "1111" =>
            B_P(0) <= A_P(15);
            B_P(15 downto 1) <= A_P(14 downto 0);
        when others =>
            B_P <= A_P;
    end case;
end process; -- End SHIFT
end RTL;

16-bit Barrel Shifter (Verilog)

///////////////////////////////////////////////////
// BARREL_ORG.V Version 1.0                      //
// Xilinx HDL Synthesis Design Guide             //
// Unoptimized model for a 16-bit Barrel Shifter //
// THIS EXAMPLE IS FOR COMPARISON ONLY           //
// Use BARREL.V                                  //
// January 1998                                  //
///////////////////////////////////////////////////

module barrel_org (S, A_P, B_P);

input [3:0] S;
input [15:0] A_P;
output [15:0] B_P;

reg [15:0] B_P;

always @ (A_P or S)
begin
    case (S)
        4'b0000 : // Shift by 0
        begin
            B_P <= A_P;
        end
        4'b0001 : // Shift by 1
        begin
            B_P[15] <= A_P[0];
            B_P[14:0] <= A_P[15:1];
        end
        4'b0010 : // Shift by 2
        begin
            B_P[15:14] <= A_P[1:0];
            B_P[13:0] <= A_P[15:2];
        end
        4'b0011 : // Shift by 3
        begin
            B_P[15:13] <= A_P[2:0];
            B_P[12:0] <= A_P[15:3];
        end
        4'b0100 : // Shift by 4
        begin
            B_P[15:12] <= A_P[3:0];
            B_P[11:0] <= A_P[15:4];
        end
        4'b0101 : // Shift by 5
        begin
            B_P[15:11] <= A_P[4:0];
            B_P[10:0] <= A_P[15:5];
        end
        4'b0110 : // Shift by 6
        begin
            B_P[15:10] <= A_P[5:0];
            B_P[9:0] <= A_P[15:6];
        end
        4'b0111 : // Shift by 7
        begin
            B_P[15:9] <= A_P[6:0];
            B_P[8:0] <= A_P[15:7];
        end
        4'b1000 : // Shift by 8
        begin
            B_P[15:8] <= A_P[7:0];
            B_P[7:0] <= A_P[15:8];
        end
        4'b1001 : // Shift by 9
        begin
            B_P[15:7] <= A_P[8:0];
            B_P[6:0] <= A_P[15:9];
        end
        4'b1010 : // Shift by 10
        begin
            B_P[15:6] <= A_P[9:0];
            B_P[5:0] <= A_P[15:10];
        end
        4'b1011 : // Shift by 11
        begin
            B_P[15:5] <= A_P[10:0];
            B_P[4:0] <= A_P[15:11];
        end
        4'b1100 : // Shift by 12
        begin
            B_P[15:4] <= A_P[11:0];
            B_P[3:0] <= A_P[15:12];
        end
        4'b1101 : // Shift by 13
        begin
            B_P[15:3] <= A_P[12:0];
            B_P[2:0] <= A_P[15:13];
        end
        4'b1110 : // Shift by 14
        begin
            B_P[15:2] <= A_P[13:0];
            B_P[1:0] <= A_P[15:14];
        end
        4'b1111 : // Shift by 15
        begin
            B_P[15:1] <= A_P[14:0];
            B_P[0] <= A_P[15];
        end
        default :
            B_P <= A_P;
    endcase
end

endmodule


The following modified VHDL and Verilog designs use two levels of
multiplexers and are twice as fast as the previous designs. These
designs are implemented using 32 4-to-1 multiplexers arranged in
two levels of sixteen. The first level rotates the input data by 0, 1,
2, or 3 bits and the second level rotates the data by 0, 4, 8, or 12
bits. Since you can build a 4-to-1 multiplexer with a single CLB, the
minimum size of this version of the design is 32 (32 x 1) CLBs.

16-bit Barrel Shifter with Two Levels of Multiplexers
(VHDL)

-- BARREL.VHD
-- Based on XAPP 26 (see http://www.xilinx.com)
-- 16-bit barrel shifter (shift right)
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;

entity barrel is
    port (S:   in  STD_LOGIC_VECTOR(3 downto 0);
          A_P: in  STD_LOGIC_VECTOR(15 downto 0);
          B_P: out STD_LOGIC_VECTOR(15 downto 0));
end barrel;

architecture RTL of barrel is
signal SEL1, SEL2: STD_LOGIC_VECTOR(1 downto 0);
signal C: STD_LOGIC_VECTOR(15 downto 0);
begin

FIRST_LVL: process (A_P, SEL1)
begin
    case SEL1 is
        when "00" => -- Shift by 0
            C <= A_P;
        when "01" => -- Shift by 1
            C(15) <= A_P(0);
            C(14 downto 0) <= A_P(15 downto 1);
        when "10" => -- Shift by 2
            C(15 downto 14) <= A_P(1 downto 0);
            C(13 downto 0) <= A_P(15 downto 2);
        when "11" => -- Shift by 3
            C(15 downto 13) <= A_P(2 downto 0);
            C(12 downto 0) <= A_P(15 downto 3);
        when others =>
            C <= A_P;
    end case;
end process; --End FIRST_LVL

SECND_LVL: process (C, SEL2)
begin
    case SEL2 is
        when "00" => --Shift by 0
            B_P <= C;
        when "01" => --Shift by 4
            B_P(15 downto 12) <= C(3 downto 0);
            B_P(11 downto 0) <= C(15 downto 4);
        when "10" => --Shift by 8
            B_P(7 downto 0) <= C(15 downto 8);
            B_P(15 downto 8) <= C(7 downto 0);
        when "11" => --Shift by 12
            B_P(3 downto 0) <= C(15 downto 12);
            B_P(15 downto 4) <= C(11 downto 0);
        when others =>
            B_P <= C;
    end case;
end process; -- End SECND_LVL

SEL1 <= S(1 downto 0);
SEL2 <= S(3 downto 2);

end rtl;

16-bit Barrel Shifter with Two Levels of Multiplexers
(Verilog)

/*
 * BARREL.V
 * XAPP 26 http://www.xilinx.com
 * 16-bit barrel shifter (shift right)
 * May 1997
 */

module barrel (S, A_P, B_P);

input [3:0] S;
input [15:0] A_P;
output [15:0] B_P;

reg [15:0] B_P;

wire [1:0] SEL1, SEL2;
reg [15:0] C;

assign SEL1 = S[1:0];
assign SEL2 = S[3:2];

always @ (A_P or SEL1)
begin
    case (SEL1)
        2'b00 : // Shift by 0
        begin
            C <= A_P;
        end
        2'b01 : // Shift by 1
        begin
            C[15] <= A_P[0];
            C[14:0] <= A_P[15:1];
        end
        2'b10 : // Shift by 2
        begin
            C[15:14] <= A_P[1:0];
            C[13:0] <= A_P[15:2];
        end
        2'b11 : // Shift by 3
        begin
            C[15:13] <= A_P[2:0];
            C[12:0] <= A_P[15:3];
        end
        default :
            C <= A_P;
    endcase
end

always @ (C or SEL2)
begin
    case (SEL2)
        2'b00 : // Shift by 0
        begin
            B_P <= C;
        end
        2'b01 : // Shift by 4
        begin
            B_P[15:12] <= C[3:0];
            B_P[11:0] <= C[15:4];
        end
        2'b10 : // Shift by 8
        begin
            B_P[7:0] <= C[15:8];
            B_P[15:8] <= C[7:0];
        end
        2'b11 : // Shift by 12
        begin
            B_P[3:0] <= C[15:12];
            B_P[15:4] <= C[11:0];
        end
        default :
            B_P <= C;
    endcase
end

endmodule

When these two designs are implemented in an XC4005E-2 device
with a popular synthesis tool, there is a 64% improvement in the gate
count (88 occupied CLBs reduced to 32 occupied CLBs) in the
barrel.vhd design as compared to the barrel_org.vhd design.
Additionally, there is a 19% improvement in speed from 35.58 ns (5
logic levels) to 28.85 ns (4 logic levels).
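
When you simulate either design, it can help to have a compact
reference for the expected output. The following is a minimal
behavioral sketch, assuming (as the case statements indicate) that
both designs rotate A_P right by S bits. The module name barrel_ref
and the internal names DOUBLED and SHIFTED are hypothetical. It is
meant as a comparison model for a test bench; the circuit a synthesis
tool would build from it is tool-dependent and is not claimed to match
the results above.

```verilog
// Illustrative reference model only; names are hypothetical.
module barrel_ref (S, A_P, B_P);

input  [3:0]  S;
input  [15:0] A_P;
output [15:0] B_P;

wire [31:0] DOUBLED;
wire [31:0] SHIFTED;

// Two copies of A_P side by side: bits that shift out of the low
// copy are replaced by the matching bits of the high copy.
assign DOUBLED = {A_P, A_P};
assign SHIFTED = DOUBLED >> S;

// The low 16 bits are A_P rotated right by S positions.
assign B_P = SHIFTED[15:0];

endmodule
```

For S = 1, B_P[15] takes A_P[0] and B_P[14:0] takes A_P[15:1], which
agrees with the "Shift by 1" branch of the case statements.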

Implementing Latches and Registers 

Synthesizers infer latches from incomplete conditional expressions,
such as an If statement without an Else clause. This can be
problematic for FPGA designs because not all FPGA devices have latches
available in the CLBs. In addition, you may think that a register is
created when the synthesis tool actually created a latch. The
XC4000EX/XL and XC5200 FPGAs do have registers that can be
configured to act as latches. For these devices, synthesizers infer a
dedicated latch from incomplete conditional expressions. XC4000E,
XC3100A, XC3000A, and Spartan devices do not have latches in their
CLBs. For these devices, latches described in RTL code are
implemented with gates in the CLB function generators. For XC4000E or
Spartan devices, if the latch is directly connected to an input port,
it is implemented in an IOB as a dedicated input latch. For example,
the D latch described in the following VHDL and Verilog designs is
implemented with one function generator as shown in the "D Latch
Implemented with Gates" figure.

D Latch Inference 

• VHDL Example 

-- D_LATCH.VHD
-- May 1997
library IEEE;
use IEEE.std_logic_1164.all;

entity d_latch is
    port (GATE, DATA: in STD_LOGIC;
          Q: out STD_LOGIC);
end d_latch;

architecture BEHAV of d_latch is
begin

LATCH: process (GATE, DATA)
begin
    if (GATE = '1') then
        Q <= DATA;
    end if;
end process; -- end LATCH

end BEHAV;

• Verilog Example

/* Transparent High Latch
 * D_LATCH.V
 * May 1997 */

module d_latch (GATE, DATA, Q);

input GATE;
input DATA;
output Q;

reg Q;

always @ (GATE or DATA)
begin
    if (GATE == 1'b1)
        Q <= DATA;
end // End Latch

endmodule


Figure 2-3 D Latch Implemented with Gates

In this example, a combinatorial loop results in a hold-time
requirement on DATA with respect to GATE. Since most synthesis tools do
not process hold-time requirements because of the uncertainty of
routing delays, Xilinx does not recommend implementing latches
with combinatorial feedback loops. A recommended method for
implementing latches is described in this section.

To eliminate this possible problem, use D registers instead of latches. 
For example, to convert the D latch to a D register, use an Else
statement or modify the code to resemble the following example.

Converting a D Latch to a D Register 

• VHDL Example 

-- D_REGISTER.VHD
-- May 1997
-- Changing Latch into a D-Register
library IEEE;
use IEEE.std_logic_1164.all;

entity d_register is
    port (CLK, DATA: in STD_LOGIC;
          Q: out STD_LOGIC);
end d_register;

architecture BEHAV of d_register is
begin

MY_D_REG: process (CLK, DATA)
begin
    if (CLK'event and CLK='1') then
        Q <= DATA;
    end if;
end process; --End MY_D_REG
end BEHAV;


• Verilog Example

/* Changing Latch into a D-Register
 * D_REGISTER.V
 * May 1997 */

module d_register (CLK, DATA, Q);

input CLK;
input DATA;
output Q;

reg Q;

always @ (posedge CLK)
begin: My_D_Reg
    Q <= DATA;
end

endmodule

With some synthesis tools you can determine the number of latches
that are implemented in your design. Check the manuals that came 
with your software for information on determining the number of 
latches in your design. 

You should convert all If statements without corresponding Else 
statements and without a clock edge to registers. Use the
recommended register coding styles in the synthesis tool documentation to
complete this conversion. 
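
Where the logic is meant to be purely combinatorial rather than
registered, adding the missing Else clause also removes the implied
storage. The following is a minimal sketch of that case; the module
name mux_no_latch and its ports are hypothetical and do not come from
this manual.

```verilog
// Illustrative sketch only; module and port names are hypothetical.
module mux_no_latch (SEL, A, B, Y);

input  SEL;
input  A;
input  B;
output Y;

reg Y;

// Y is assigned on every path through the If statement, so the
// block describes a multiplexer and no latch is implied.
always @ (SEL or A or B)
begin: MUX
    if (SEL == 1'b1)
        Y = A;
    else
        Y = B;
end

endmodule
```

Removing the else branch from this block would leave Y unassigned
when SEL is low, which is the incomplete condition that implies a
latch.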

In XC4000E devices, you can implement a D latch by instantiating a 
RAM 16x1 primitive, as illustrated in the following figure. 


Figure 2-4 D Latch Implemented by Instantiating a RAM

In all other cases (such as latches with reset/set or enable), use a D
flip-flop instead of a latch. This rule also applies to JK and SR
flip-flops.

The following table provides a comparison of area and speed for a D
latch implemented with gates, a 16x1 RAM primitive, and a D
flip-flop.

Table 2-1 D Latch Implementation Comparison

Spartan, XC4000E CLB Latch Implemented with Gates
    Advantages:    RTL HDL infers latch
    Disadvantages: Feedback loop results in hold time requirement,
                   not suggested
    Area (a):      1 Function Generator

XC4000EX/XL/XV, XC5200 CLB Latch
    Advantages:    RTL HDL infers latch, no hold times
    Disadvantages: Not available in XC4000E or Spartan
    Area (a):      1 CLB Register/Latch

All Spartan and XC4000 Input Latch
    Advantages:    RTL HDL infers latch, no hold times (if not
                   specifying NODELAY, saves CLB resources)
    Disadvantages: Not available in XC5200, input to latch must
                   directly connect to port
    Area (a):      1 IOB Register/Latch

XC4000E/EX/XL/XV Instantiated RAM Latch
    Advantages:    No hold time or combinatorial loops, best for
                   XC4000E when latch needed in CLB
    Disadvantages: Must be instantiated, uses logic resources
    Area (a):      1 Function Generator

All Families D Flip-Flop
    Advantages:    No hold time or combinatorial loop. FPGAs are
                   register abundant.
    Disadvantages: Requires change in code to convert latch to
                   register
    Area (a):      1 CLB Register/Latch

(a) Area is the number of function generators and registers required.
XC4000 and Spartan CLBs have two function generators and two
registers; XC5200 CLBs have four function generators and four
register/latches.


Resource Sharing 

Resource sharing is an optimization technique that uses a single
functional block (such as an adder or comparator) to implement several
operators in the HDL code. Use resource sharing to improve design
performance by reducing the gate count and the routing congestion.
If you do not use resource sharing, each HDL operation is built with
separate circuitry. However, you may want to disable resource
sharing for speed critical paths in your design.

The following operators can be shared either with instances of the
same operator or with an operator on the same line.

*

+ -

> >= < <=

For example, a + operator can be shared with instances of other +
operators or with - operators. A * operator can be shared only with
other * operators.

You can implement arithmetic functions (+, -, magnitude comparators)
with gates or with your synthesis tool's module library. The
library functions use modules that take advantage of the carry logic
in XC4000 family, XC5200 family, and Spartan family CLBs. Carry
logic and its dedicated routing increase the speed of arithmetic
functions that are larger than 4 bits. To increase speed, use the module
library if your design contains arithmetic functions that are larger
than 4 bits or if your design contains only one arithmetic function.
Resource sharing of the module library automatically occurs in most
synthesis tools if the arithmetic functions are in the same process.

Resource sharing adds additional logic levels to multiplex the inputs 
to implement more than one function. Therefore, you may not want 
to use it for arithmetic functions that are part of your design's time 
critical path. 
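
The extra multiplexer level can be written out explicitly in the source. The following sketch (module and wire names are illustrative) selects the operands first and then uses a single adder. This is roughly the structure a synthesis tool is expected to build when it shares one adder between the two branches of an If statement; the exact result depends on the tool.

```verilog
// Hypothetical example: one adder shared by two operand pairs.
module shared_adder (A1, B1, C1, D1, COND_1, Z1);
input COND_1;
input [7:0] A1, B1, C1, D1;
output [7:0] Z1;

wire [7:0] OP_X, OP_Y;

// Input multiplexers: the added logic level in front of the adder.
assign OP_X = COND_1 ? A1 : C1;
assign OP_Y = COND_1 ? B1 : D1;

// Single adder used for both cases.
assign Z1 = OP_X + OP_Y;
endmodule
```

Writing the multiplexers by hand makes the added logic level on the adder inputs visible, which helps when deciding whether a path is too timing critical to share.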

Since resource sharing allows you to reduce the number of design
resources, the device area required for your design is also decreased.
The area that is used for a shared resource depends on the type and
bit width of the shared operation. You should create a shared
resource to accommodate the largest bit width and to perform all
operations.

If you use resource sharing in your designs, you may want to use
multiplexers to transfer values from different sources to a common
resource input. In designs that have shared operations with the same
output target, the number of multiplexers is reduced as illustrated in
the following VHDL and Verilog examples. The HDL example is
shown implemented with gates in the "Implementation of Resource
Sharing" figure.

• VHDL Example 




-- RES_SHARING.VHD
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use IEEE.std_logic_arith.all;

entity res_sharing is
  port (A1,B1,C1,D1: in STD_LOGIC_VECTOR (7 downto 0);
        COND_1: in STD_LOGIC;
        Z1: out STD_LOGIC_VECTOR (7 downto 0));
end res_sharing;

architecture BEHAV of res_sharing is
begin
  P1: process (A1,B1,C1,D1,COND_1)
  begin
    if (COND_1='1') then
      Z1 <= A1 + B1;
    else
      Z1 <= C1 + D1;
    end if;
  end process; -- end P1
end BEHAV;

• Verilog Example 

/* Resource Sharing Example
 * RES_SHARING.V
 * May 1997 */

module res_sharing (A1, B1, C1, D1, COND_1, Z1);

input COND_1;
input [7:0] A1, B1, C1, D1;
output [7:0] Z1;
reg [7:0] Z1;

always @(A1 or B1 or C1 or D1 or COND_1)
begin
  if (COND_1)
    Z1 <= A1 + B1;
  else
    Z1 <= C1 + D1;
end

endmodule



If you disable resource sharing or if you code the design with the
adders in separate processes, the design is implemented using two
separate modules as shown in the "Implementation without
Resource Sharing" figure.

Figure 2-5 Implementation of Resource Sharing

Some synthesis tools generate modules from special Xilinx module
generation algorithms. Generally, this module generation is used for
operators such as adders, subtracters, incrementers, decrementers,
and comparators. The following table provides a comparison of the
number of CLBs used and the delay for the VHDL and Verilog
designs with and without resource sharing.

Table 2-2 Resource Sharing/No Resource Sharing Comparison
for XC4005EPC84-2

| Comparison | Resource Sharing with Xilinx Module Generation | No Resource Sharing with Xilinx Module Generation | Resource Sharing without Xilinx Module Generation | No Resource Sharing without Xilinx Module Generation |
|---|---|---|---|---|
| F/G Functions | 24 | 24 | 19 | 28 |
| H Function Generators | 0 | 0 | 11 | 8 |
| Fast Carry Logic CLBs | 5 | 10 | 0 | 0 |
| Longest Delay | 27.878 ns | 23.761 ns | 47.010 ns | 33.386 ns |
| Advantages/Disadvantages | Potential for area reduction | Potential for decreased critical path delay | No carry logic increases path delays | No carry logic increases CLB count |


Note: Refer to the appropriate reference manual for more information
on resource sharing.


Gate Reduction 

Use the generated module components to reduce the number of gates 
in your designs. The module generation algorithms use Xilinx carry 
logic to reduce function generator logic and improve routing and 
speed performance. Further gate reduction can occur with synthesis
tools that recognize the use of constants with the modules.

Preset Pin or Clear Pin

Xilinx FPGAs consist of CLBs that contain function generators and
flip-flops. The XC4000 family and Spartan family flip-flops have a
dedicated clock enable pin and either a clear (asynchronous reset) pin
or a preset (asynchronous set) pin. All synchronous preset or clear
functions can be implemented with combinatorial logic in the
function generators.
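
A minimal sketch of a synchronous clear is shown below (the module and port names are illustrative). Because RESET does not appear in the sensitivity list, it is sampled only on the rising clock edge, and a synthesis tool is expected to build it from logic in front of the register's D input rather than from the dedicated clear pin.

```verilog
// Hypothetical example: register with synchronous clear.
module ff_sync_clear (CLOCK, RESET, D_IN, Q_OUT);
input CLOCK, RESET;
input [7:0] D_IN;
output [7:0] Q_OUT;
reg [7:0] Q_OUT;

// RESET is evaluated only at the rising edge of CLOCK.
always @(posedge CLOCK)
begin
  if (RESET)
    Q_OUT <= 8'b00000000;
  else
    Q_OUT <= D_IN;
end
endmodule
```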

The XC3000 family and XC5200 family FPGAs have an asynchronous
reset pin on the CLB registers. An asynchronous preset can be 
inferred, but is built by connecting one inverter to the D input and 
connecting a second inverter to the Q output of a register. In this case, 
an asynchronous preset is created when the asynchronous reset is 
activated. This may require additional logic and increase delays. If 
possible, the inverters are merged with existing logic connected to the 
register input or output. 
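
The inverter arrangement described above can be sketched directly in Verilog. The names below are illustrative; the sketch only shows the equivalent structure and is not the netlist that any particular synthesis tool produces.

```verilog
// Hypothetical example: preset behavior built from a reset-only register.
module preset_from_reset (CLOCK, PRESET, D, Q);
input CLOCK, PRESET, D;
output Q;
reg Q_N;  // internal register holds the inverted data

// Register with asynchronous reset; the first inverter is on the D input.
always @(posedge PRESET or posedge CLOCK)
begin
  if (PRESET)
    Q_N <= 1'b0;
  else
    Q_N <= ~D;
end

// Second inverter on the Q output: resetting Q_N to 0 drives Q to 1.
assign Q = ~Q_N;
endmodule
```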

You can configure FPGA CLB registers to have either a preset pin or a 
clear pin. You cannot configure the CLB register for both pins. You 
must modify any process that requires both pins to use only one pin 
or you must use three registers and a mux to implement the process. 
If a register is described with an asynchronous set and reset, your 
synthesis tool may issue an error message similar to the following 
during the compilation of your design. 

Warning: Target library contains no replacement for
register 'Q_reg' (**FFGEN**). (TRANS-4)

Warning: Cell 'Q_reg' (**FFGEN**) not translated.
(TRANS-1)

During the implementation of the synthesized netlist, NGDBuild
issues the following error message.

ERROR:basnu - logical block "Q_reg" of type "_FFGEN_"
is unexpanded.

An XC4000 CLB is shown in the following figure.

Figure 2-7 XC4000 Configurable Logic Block 

The following VHDL and Verilog designs show how to describe a
register with a clock enable and either an asynchronous preset or
clear.

Register Inference 

• VHDL Example 

-- FF_EXAMPLE.VHD
-- May 1997
-- Example of Implementing Registers

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity ff_example is
  port ( RESET, CLOCK, ENABLE: in STD_LOGIC;
         D_IN: in STD_LOGIC_VECTOR (7 downto 0);
         A_Q_OUT: out STD_LOGIC_VECTOR (7 downto 0);
         B_Q_OUT: out STD_LOGIC_VECTOR (7 downto 0);
         C_Q_OUT: out STD_LOGIC_VECTOR (7 downto 0);
         D_Q_OUT: out STD_LOGIC_VECTOR (7 downto 0));
end ff_example;

architecture BEHAV of ff_example is
begin

  -- D flip-flop
  FF: process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      A_Q_OUT <= D_IN;
    end if;
  end process; -- End FF

  -- Flip-flop with asynchronous reset
  FF_ASYNC_RESET: process (RESET, CLOCK)
  begin
    if (RESET = '1') then
      B_Q_OUT <= "00000000";
    elsif (CLOCK'event and CLOCK='1') then
      B_Q_OUT <= D_IN;
    end if;
  end process; -- End FF_ASYNC_RESET

  -- Flip-flop with asynchronous set
  FF_ASYNC_SET: process (RESET, CLOCK)
  begin
    if (RESET = '1') then
      C_Q_OUT <= "11111111";
    elsif (CLOCK'event and CLOCK = '1') then
      C_Q_OUT <= D_IN;
    end if;
  end process; -- End FF_ASYNC_SET

  -- Flip-flop with asynchronous reset and clock enable
  FF_CLOCK_ENABLE: process (ENABLE, RESET, CLOCK)
  begin
    if (RESET = '1') then
      D_Q_OUT <= "00000000";
    elsif (CLOCK'event and CLOCK='1') then
      if (ENABLE='1') then
        D_Q_OUT <= D_IN;
      end if;
    end if;
  end process; -- End FF_CLOCK_ENABLE

end BEHAV;

• Verilog Example 

/* Example of Implementing Registers
 * FF_EXAMPLE.V
 * May 1997 */

module ff_example (RESET, CLOCK, ENABLE, D_IN,
                   A_Q_OUT, B_Q_OUT, C_Q_OUT, D_Q_OUT);

input RESET, CLOCK, ENABLE;
input [7:0] D_IN;
output [7:0] A_Q_OUT;
output [7:0] B_Q_OUT;
output [7:0] C_Q_OUT;
output [7:0] D_Q_OUT;

reg [7:0] A_Q_OUT;
reg [7:0] B_Q_OUT;
reg [7:0] C_Q_OUT;
reg [7:0] D_Q_OUT;

// D flip-flop
always @(posedge CLOCK)
begin
  A_Q_OUT <= D_IN;
end

// Flip-flop with asynchronous reset
always @(posedge RESET or posedge CLOCK)
begin
  if (RESET)
    B_Q_OUT <= 8'b00000000;
  else
    B_Q_OUT <= D_IN;
end

// Flip-flop with asynchronous set
always @(posedge RESET or posedge CLOCK)
begin
  if (RESET)
    C_Q_OUT <= 8'b11111111;
  else
    C_Q_OUT <= D_IN;
end

// Flip-flop with asynchronous reset & clock enable
always @(posedge RESET or posedge CLOCK)
begin
  if (RESET)
    D_Q_OUT <= 8'b00000000;
  else if (ENABLE)
    D_Q_OUT <= D_IN;
end

endmodule

Using Clock Enable Pin Instead of Gated Clocks

Use the CLB clock enable pin instead of gated clocks in your designs.
Gated clocks can introduce glitches, increased clock delay, clock skew,
and other undesirable effects. The first two examples in this section
(VHDL and Verilog) illustrate a design that uses a gated clock. The
"Implementation of Gated Clock" figure shows this design
implemented with gates. Following these examples are VHDL and Verilog
designs that show how you can modify the gated clock design to use
the clock enable pin of the CLB. The "Implementation of Clock
Enable" figure shows this design implemented with gates.

• VHDL Example 


-- GATE_CLOCK.VHD Version 1.1
-- Illustrates clock buffer control
-- Better implementation is to use
-- clock enable rather than gated clock
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity gate_clock is
  port (IN1,IN2,DATA,CLK,LOAD: in STD_LOGIC;
        OUT1: out STD_LOGIC);
end gate_clock;

architecture BEHAVIORAL of gate_clock is

signal GATECLK: STD_LOGIC;

begin

  GATECLK <= (IN1 and IN2 and CLK);

  GATE_PR: process (GATECLK,DATA,LOAD)
  begin
    if (GATECLK'event and GATECLK='1') then
      if (LOAD='1') then
        OUT1 <= DATA;
      end if;
    end if;
  end process; -- End GATE_PR

end BEHAVIORAL;

• Verilog Example

//////////////////////////////////////////
// GATE_CLOCK.V Version 1.1             //
// Gated Clock Example                  //
// Better implementation to use clock   //
// enables than gating the clock        //
// May 1997                             //
//////////////////////////////////////////

module gate_clock(IN1, IN2, DATA, CLK, LOAD, OUT1) ;

input IN1 ;
input IN2 ;
input DATA ;
input CLK ;
input LOAD ;
output OUT1 ;

reg OUT1 ;
wire GATECLK ;

assign GATECLK = (IN1 & IN2 & CLK);

always @(posedge GATECLK)
begin
  if (LOAD == 1'b1)
    OUT1 <= DATA;
end

endmodule




Figure 2-8 Implementation of Gated Clock 

• VHDL Example 

-- CLOCK_ENABLE.VHD
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity clock_enable is
  port (IN1,IN2,DATA,CLOCK,LOAD: in STD_LOGIC;
        DOUT: out STD_LOGIC);
end clock_enable;

architecture BEHAV of clock_enable is

signal ENABLE: STD_LOGIC;

begin

  ENABLE <= IN1 and IN2 and LOAD;

  EN_PR: process (ENABLE,DATA,CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      if (ENABLE='1') then
        DOUT <= DATA;
      end if;
    end if;
  end process; -- End EN_PR

end BEHAV;

• Verilog Example 

/* Clock enable example
 * CLOCK_ENABLE.V
 * May 1997 */

module clock_enable (IN1, IN2, DATA, CLK, LOAD, DOUT);

input IN1, IN2, DATA;
input CLK, LOAD;
output DOUT;

wire ENABLE;
reg DOUT;

assign ENABLE = IN1 & IN2 & LOAD;

always @(posedge CLK)
begin
  if (ENABLE)
    DOUT <= DATA;
end

endmodule



Figure 2-9 Implementation of Clock Enable 

Using If Statements 

The VHDL syntax for If statements is as follows: 

if condition then
  sequence_of_statements;
{elsif condition then
  sequence_of_statements;}
else
  sequence_of_statements;
end if;

The Verilog syntax for If statements is as follows: 

if (condition)
begin
  sequence of statements;
end
{else if (condition)
begin
  sequence of statements;
end}
else
begin
  sequence of statements;
end

Use If statements to execute a sequence of statements based on the
value of a condition. The If statement checks each condition in order
until the first true condition is found and then executes the
statements associated with that condition. After a true condition is found
and the statements associated with that condition are executed, the
rest of the If statement is ignored. If none of the conditions are true,
and an Else clause is present, the statements associated with the Else
are executed. If none of the conditions are true, and an Else clause is
not present, none of the statements are executed.

If the conditions are not completely specified (as shown below), a 
latch is inferred to hold the value of the target signal. 

• VHDL Example 

if (L = '1') then
  Q <= D;
end if;

• Verilog Example

if (L==1'b1)
  Q=D;

To avoid a latch inference, specify all conditions, as shown here.

• VHDL Example

if (L = '1') then
  Q <= D;
else
  Q <= '0';
end if;

• Verilog Example

if (L==1'b1)
  Q=D;
else
  Q=0;
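
Another common way to specify all conditions is to assign a default value at the top of the block, before the If statement. The sketch below (module name is illustrative) wraps the fragment above in a complete combinational block; every path through the block assigns Q, so no latch is needed to hold its value.

```verilog
// Hypothetical example: default assignment prevents latch inference.
module no_latch (L, D, Q);
input L, D;
output Q;
reg Q;

always @(L or D)
begin
  Q = 1'b0;        // default value, used whenever L is not 1
  if (L == 1'b1)
    Q = D;
end
endmodule
```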


Using Case Statements 

The VHDL syntax for Case statements is as follows. 

case expression is
  when choices =>
    {sequence_of_statements;}
  {when choices =>
    {sequence_of_statements;}}
  when others =>
    {sequence_of_statements;}
end case;

The Verilog syntax for Case statements is as follows. 

case (expression)
  choices: statement;
  {choices: statement;}
  default: statement;
endcase

Use Case statements to execute one of several sequences of
statements, depending on the value of the expression. When the Case
statement is executed, the given expression is compared to each
choice until a match is found. The statements associated with the
matching choice are executed. The statements associated with the
Others (VHDL) or Default (Verilog) clause are executed when the
given expression does not match any of the choices. The Others or
Default clause is optional; however, if you do not use it, you must
include all possible values for expression. For clarity and for
synthesis, each Choices statement must have a unique value for the
expression. If possible, put the most likely Cases first to improve
simulation speed.
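
As a small illustration of these rules, the following sketch describes a 2-to-4 decoder (the module and port names are illustrative). Each choice has a unique value, and the Default clause assigns the output for any value of the expression that is not listed.

```verilog
// Hypothetical example: 2-to-4 decoder with a Default clause.
module decode_2to4 (SEL, Y);
input [1:0] SEL;
output [3:0] Y;
reg [3:0] Y;

always @(SEL)
begin
  case (SEL)
    2'b00:   Y = 4'b0001;
    2'b01:   Y = 4'b0010;
    2'b10:   Y = 4'b0100;
    2'b11:   Y = 4'b1000;
    default: Y = 4'b0000;  // reached only for x or z values of SEL
  endcase
end
endmodule
```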

Using Nested If Statements 

Improper use of the Nested If statement can result in an increase in
area and longer delays in your designs. Each If keyword specifies
priority-encoded logic. To avoid long path delays, do not use
extremely long Nested If constructs as shown in the following
VHDL/Verilog examples. These designs are shown implemented in
gates in the "Implementation of Nested If" figure. Following these
examples are VHDL and Verilog designs that use the Case construct
with the Nested If to more effectively describe the same function. The
Case construct reduces the delay by approximately 3 ns (using an
XC4005E-2 part). The implementation of this design is shown in the
"Implementation of If-Case" figure.

Inefficient Use of Nested If Statement 

• VHDL Example 

-- NESTED_IF.VHD
-- May 1997

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;
use IEEE.STD_LOGIC_ARITH.all;

entity nested_if is
  port (ADDR_A: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_B: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_C: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_D: in std_logic_vector (1 downto 0); -- ADDRESS Code
        RESET:  in std_logic;
        CLK :   in std_logic;
        DEC_Q:  out std_logic_vector (5 downto 0)); -- Decode OUTPUT
end nested_if;

architecture xilinx of nested_if is
begin

-- NESTED_IF PROCESS
NESTED_IF: process (CLK)
begin
  if (CLK'event and CLK = '1') then
    if (RESET = '0') then
      if (ADDR_A = "00") then
        DEC_Q(5 downto 4) <= ADDR_D;
        DEC_Q(3 downto 2) <= "01";
        DEC_Q(1 downto 0) <= "00";
        if (ADDR_B = "01") then
          DEC_Q(3 downto 2) <= unsigned(ADDR_A) + '1';
          DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
          if (ADDR_C = "10") then
            DEC_Q(5 downto 4) <= unsigned(ADDR_D) + '1';
            if (ADDR_D = "11") then
              DEC_Q(5 downto 4) <= "00";
            end if;
          else
            DEC_Q(5 downto 4) <= ADDR_D;
          end if;
        end if;
      else
        DEC_Q(5 downto 4) <= ADDR_D;
        DEC_Q(3 downto 2) <= ADDR_A;
        DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
      end if;
    else
      DEC_Q <= "000000";
    end if;
  end if;
end process;
end xilinx;

• Verilog Example 

////////////////////////////////////////
// NESTED_IF.V                        //
// Nested If vs. Case Design Example  //
// August 1997                        //
////////////////////////////////////////

module nested_if (ADDR_A, ADDR_B, ADDR_C, ADDR_D, RESET, CLK, DEC_Q);

input [1:0] ADDR_A ;
input [1:0] ADDR_B ;
input [1:0] ADDR_C ;
input [1:0] ADDR_D ;
input RESET, CLK ;

output [5:0] DEC_Q ;

reg [5:0] DEC_Q ;

// Nested If Process //
always @ (posedge CLK)
begin
  if (RESET == 1'b1)
  begin
    if (ADDR_A == 2'b00)
    begin
      DEC_Q[5:4] <= ADDR_D;
      DEC_Q[3:2] <= 2'b01;
      DEC_Q[1:0] <= 2'b00;
      if (ADDR_B == 2'b01)
      begin
        DEC_Q[3:2] <= ADDR_A + 1'b1;
        DEC_Q[1:0] <= ADDR_B + 1'b1;
        if (ADDR_C == 2'b10)
        begin
          DEC_Q[5:4] <= ADDR_D + 1'b1;
          if (ADDR_D == 2'b11)
            DEC_Q[5:4] <= 2'b00;
        end
        else
          DEC_Q[5:4] <= ADDR_D;
      end
    end
    else
    begin
      // begin/end keeps all three assignments inside the else branch
      DEC_Q[5:4] <= ADDR_D;
      DEC_Q[3:2] <= ADDR_A;
      DEC_Q[1:0] <= ADDR_B + 1'b1;
    end
  end
  else
    DEC_Q <= 6'b000000;
end

endmodule

Nested If Example Modified to Use If-Case

Note: In the following example, the hyphens ("don't cares") used for 
bits in the Case statement may evaluate incorrectly to false for some 
synthesis tools. 

• VHDL Example 

-- IF_CASE.VHD
-- May 1997

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;
use IEEE.STD_LOGIC_ARITH.all;

entity if_case is
  port (ADDR_A: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_B: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_C: in std_logic_vector (1 downto 0); -- ADDRESS Code
        ADDR_D: in std_logic_vector (1 downto 0); -- ADDRESS Code
        RESET:  in std_logic;
        CLK :   in std_logic;
        DEC_Q:  out std_logic_vector (5 downto 0)); -- Decode OUTPUT
end if_case;

architecture xilinx of if_case is

signal ADDR_ALL : std_logic_vector (7 downto 0);

begin

-- concatenate all address lines
ADDR_ALL <= (ADDR_A & ADDR_B & ADDR_C & ADDR_D) ;

-- Use 'case' instead of 'nested_if' for efficient gate netlist
IF_CASE: process (CLK)
begin
  if (CLK'event and CLK = '1') then
    if (RESET = '0') then
      case ADDR_ALL is
        when "00011011" =>
          DEC_Q(5 downto 4) <= "00";
          DEC_Q(3 downto 2) <= unsigned(ADDR_A) + '1';
          DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
        when "000110--" =>
          DEC_Q(5 downto 4) <= unsigned(ADDR_D) + '1';
          DEC_Q(3 downto 2) <= unsigned(ADDR_A) + '1';
          DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
        when "0001----" =>
          DEC_Q(5 downto 4) <= ADDR_D;
          DEC_Q(3 downto 2) <= unsigned(ADDR_A) + '1';
          DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
        when "00------" =>
          DEC_Q(5 downto 4) <= ADDR_D;
          DEC_Q(3 downto 2) <= "01";
          DEC_Q(1 downto 0) <= "00";
        when others =>
          DEC_Q(5 downto 4) <= ADDR_D;
          DEC_Q(3 downto 2) <= ADDR_A;
          DEC_Q(1 downto 0) <= unsigned(ADDR_B) + '1';
      end case;
    else
      DEC_Q <= "000000";
    end if;
  end if;
end process;
end xilinx;

• Verilog Example 

////////////////////////////////////////
// IF_CASE.V                          //
// Nested If vs. Case Design Example  //
// August 1997                        //
////////////////////////////////////////

module if_case (ADDR_A, ADDR_B, ADDR_C, ADDR_D, RESET, CLK, DEC_Q);

input [1:0] ADDR_A ;
input [1:0] ADDR_B ;
input [1:0] ADDR_C ;
input [1:0] ADDR_D ;
input RESET, CLK ;

output [5:0] DEC_Q ;

wire [7:0] ADDR_ALL ;
reg [5:0] DEC_Q ;

// Concatenate all address lines //
assign ADDR_ALL = {ADDR_A, ADDR_B, ADDR_C, ADDR_D} ;

// Use 'case' instead of 'nested_if' for efficient gate netlist //
always @ (posedge CLK)
begin
  if (RESET == 1'b1)
  begin
    casex (ADDR_ALL)
      8'b00011011: begin
        DEC_Q[5:4] <= 2'b00;
        DEC_Q[3:2] <= ADDR_A + 1;
        DEC_Q[1:0] <= ADDR_B + 1'b1;
      end
      8'b000110xx: begin
        DEC_Q[5:4] <= ADDR_D + 1'b1;
        DEC_Q[3:2] <= ADDR_A + 1'b1;
        DEC_Q[1:0] <= ADDR_B + 1'b1;
      end
      8'b0001xxxx: begin
        DEC_Q[5:4] <= ADDR_D;
        DEC_Q[3:2] <= ADDR_A + 1'b1;
        DEC_Q[1:0] <= ADDR_B + 1'b1;
      end
      8'b00xxxxxx: begin
        DEC_Q[5:4] <= ADDR_D;
        DEC_Q[3:2] <= 2'b01;
        DEC_Q[1:0] <= 2'b00;
      end
      default: begin
        DEC_Q[5:4] <= ADDR_D;
        DEC_Q[3:2] <= ADDR_A;
        DEC_Q[1:0] <= ADDR_B + 1'b1;
      end
    endcase
  end
  else
    DEC_Q <= 6'b000000;
end

endmodule

Figure 2-11 Implementation of If-Case

Comparing If Statement and Case Statement 

The If statement generally produces priority-encoded logic and the
Case statement generally creates balanced logic. An If statement can
contain a set of different expressions while a Case statement is
evaluated against a common controlling expression. In general, use the
Case statement for complex decoding and use the If statement for
speed critical paths.

Most current synthesis tools can determine if the if-elsif conditions
are mutually exclusive, and will not create extra logic to build the
priority tree.

The following examples use an If construct in a 4-to-1 multiplexer
design. The "If_Ex Implementation" figure shows the implementation
of these designs.

4-to-1 Multiplexer Design with If Construct

• VHDL Example 

-- IF_EX.VHD
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity if_ex is
  port (SEL: in STD_LOGIC_VECTOR (1 downto 0);
        A,B,C,D: in STD_LOGIC;
        MUX_OUT: out STD_LOGIC);
end if_ex;

architecture BEHAV of if_ex is
begin

  IF_PRO: process (SEL,A,B,C,D)
  begin
    if (SEL="00") then MUX_OUT <= A;
    elsif (SEL="01") then MUX_OUT <= B;
    elsif (SEL="10") then MUX_OUT <= C;
    elsif (SEL="11") then MUX_OUT <= D;
    else MUX_OUT <= '0';
    end if;
  end process; -- END IF_PRO

end BEHAV;

• Verilog Example 

///////////////////////////////////////////////
// IF_EX.V                                   //
// Example of an If statement showing a      //
// mux created using priority encoded logic  //
// HDL Synthesis Design Guide for FPGAs      //
// November 1997                             //
///////////////////////////////////////////////

module if_ex (A, B, C, D, SEL, MUX_OUT);

input A, B, C, D;
input [1:0] SEL;
output MUX_OUT;

reg MUX_OUT;

always @ (A or B or C or D or SEL)
begin
  if (SEL == 2'b00)
    MUX_OUT = A;
  else if (SEL == 2'b01)
    MUX_OUT = B;
  else if (SEL == 2'b10)
    MUX_OUT = C;
  else if (SEL == 2'b11)
    MUX_OUT = D;
  else
    MUX_OUT = 0;
end

endmodule



Figure 2-12 If_Ex Implementation

The following VHDL and Verilog examples use a Case construct for
the same multiplexer. The "Case_Ex Implementation" figure shows
the implementation of these designs. In these examples, the Case
implementation requires only one XC4000 CLB while the If construct
requires two CLBs in some synthesis tools. In this case, design the
multiplexer using the Case construct because fewer resources are
used and the delay path is shorter.

4-to-1 Multiplexer Design with Case Construct 

• VHDL Example 

-- CASE_EX.VHD
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity case_ex is
  port (SEL: in STD_LOGIC_VECTOR (1 downto 0);
        A,B,C,D: in STD_LOGIC;
        MUX_OUT: out STD_LOGIC);
end case_ex;

architecture BEHAV of case_ex is
begin

  CASE_PRO: process (SEL,A,B,C,D)
  begin
    case SEL is
      when "00" => MUX_OUT <= A;
      when "01" => MUX_OUT <= B;
      when "10" => MUX_OUT <= C;
      when "11" => MUX_OUT <= D;
      when others => MUX_OUT <= '0';
    end case;
  end process; -- End CASE_PRO

end BEHAV;

• Verilog Example 

//////////////////////////////////////////
// CASE_EX.V                            //
// Example of a Case statement showing  //
// a mux created using parallel logic   //
// HDL Synthesis Design Guide for FPGAs //
// November 1997                        //
//////////////////////////////////////////

module case_ex (A, B, C, D, SEL, MUX_OUT);

input A, B, C, D;
input [1:0] SEL;
output MUX_OUT;

reg MUX_OUT;

always @ (A or B or C or D or SEL)
begin
  case (SEL)
    2'b00:   MUX_OUT = A;
    2'b01:   MUX_OUT = B;
    2'b10:   MUX_OUT = C;
    2'b11:   MUX_OUT = D;
    default: MUX_OUT = 0;
  endcase
end

endmodule




Figure 2-13 Case_Ex Implementation

Understanding High-Density Design Flow


This chapter describes the steps in a typical HDL design flow.
Although these steps may vary with each design, the information in
this chapter is a good starting point for any design. If necessary, refer
to the current version of the Quick Start Guide for the Xilinx Alliance
Series to familiarize yourself with the Xilinx and interface tools. This
chapter includes the following sections.

• "Design Flow" 

• "Entering your Design and Selecting Hierarchy"

• "Functional Simulation of your Design" 

• "Synthesizing and Optimizing your Design" 

• "Setting Timing Constraints" 

• "Evaluating Design Size and Performance" 

• "Evaluating your Design for Coding Style and System Features" 

• "Placing and Routing Your Design" 

• "Timing Simulation of Your Design" 

• "Downloading to the Device and In-system Debugging" 

• "Creating a PROM File for Stand-Alone Operation" 

Design Flow 

An overview of the design flow steps is shown in the following
figure.

Figure 3-1 Design Flow Overview

Entering your Design and Selecting Hierarchy

The first step in implementing your design is creating the HDL code
based on your design criteria.

Design Entry Recommendations

The following recommendations can help you create effective
designs.

Use RTL Code 

By using register transfer level (RTL) code and avoiding (when 
possible) instantiating specific components, you can create designs 
with the following characteristics. 

Note: In certain cases, instantiating optimized modules, such as
LogiBLOX modules, is beneficial with RTL.

• Readable code

• Faster and simpler simulation 

• Portable code for migration to different device families 

• Reuse of code in future designs 

Carefully Select Design Hierarchy 

Selecting the correct design hierarchy is advantageous for the
following reasons.

• Improves simulation and synthesis results 

• Modular designs are easier to debug and modify 

• Allows parallel engineering (a team of engineers can work on 
different parts of the design at the same time) 

• Improves the placement and routing of your design by reducing 
routing congestion and improving timing 

• Allows for easier code reuse in the current design, as well as in
future designs

Functional Simulation of your Design

Use functional or RTL simulation to verify the syntax and
functionality of your design. Use the following recommendations when
simulating your design.

• Typically with larger hierarchical HDL designs, you should 
perform separate simulations on each module before testing your 
entire design. This makes it easier to debug your code. 

• Once each module functions as expected, create a test bench to 
verify that your entire design functions as planned. You can use 
the test bench again for the final timing simulation to confirm 
that your design functions as expected under worst-case delay 
conditions. 
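
A test bench of the kind described above can be sketched as follows. The module under test, mux4, is a hypothetical 4-to-1 multiplexer included only to make the example self-contained, and the stimulus uses a single data pattern; a real test bench would normally apply many more vectors.

```verilog
`timescale 1ns / 1ns

// Hypothetical module under test: 4-to-1 multiplexer.
module mux4 (A, B, C, D, SEL, MUX_OUT);
input A, B, C, D;
input [1:0] SEL;
output MUX_OUT;
reg MUX_OUT;

always @(A or B or C or D or SEL)
begin
  case (SEL)
    2'b00:   MUX_OUT = A;
    2'b01:   MUX_OUT = B;
    2'b10:   MUX_OUT = C;
    default: MUX_OUT = D;
  endcase
end
endmodule

// Test bench: applies every select value and compares the output
// against the data input that should be selected.
module mux4_tb;
reg A, B, C, D;
reg [1:0] SEL;
wire MUX_OUT;
reg [3:0] DATA;
integer I, ERRORS;

mux4 UUT (A, B, C, D, SEL, MUX_OUT);

initial
begin
  ERRORS = 0;
  DATA = 4'b0110;               // D C B A = 0 1 1 0
  {D, C, B, A} = DATA;
  for (I = 0; I < 4; I = I + 1)
  begin
    SEL = I;
    #10;                        // allow the output to settle
    if (MUX_OUT !== DATA[I])
    begin
      ERRORS = ERRORS + 1;
      $display("Mismatch: SEL=%b MUX_OUT=%b", SEL, MUX_OUT);
    end
  end
  if (ERRORS == 0)
    $display("Test passed");
  $finish;
end
endmodule
```

Because the check compares the output against the stimulus rather than against a waveform, the same test bench can be rerun unchanged on the back-annotated netlist for timing simulation, provided the settle delay is long enough for the worst-case path.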

Synthesizing and Optimizing your Design 

This section includes recommendations for compiling your designs to
improve your results and decrease the run time.

Note: Refer to your synthesis tool documentation for more information
on compilation options and suggestions.

Creating an Initialization File 

Before you can compile your design, you must create an initialization
file to specify compiler defaults, and to point to the applicable
implementation libraries. Refer to your synthesis tool documentation for
information on creating this file.

Creating a Compile Run Script 

The next step is to create a compile run script for iterative design
compilations, and to use as a reference for the steps in the synthesis
process. Many commonly-used synthesis tools have this capability. If
you are a new user, you may want to use the graphical user interface
to compile your design instead of using a run script. However, the
iterative design compilation process can be tedious with the
graphical interface. A run script can speed up the design process.

Compiling Your Design

Use the recommendations in this section to successfully compile your 
design. 

Modifying your Design 

You may need to modify your code to successfully compile your
design because certain design constructs that are effective for
simulation may not be as effective for synthesis. The synthesis syntax and
code set may differ slightly from the simulator syntax and code set.
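
As one illustration of this difference, the sketch below contrasts a simulation-oriented register description with a synthesis-oriented one. The names are illustrative, and the statement that synthesis tools commonly ignore delay values and initial blocks is a general assumption; check your synthesis tool documentation for the constructs it supports.

```verilog
// Hypothetical example: simulation-oriented vs. synthesis-oriented code.
module sim_vs_synth (CLK, RESET, D, Q_SIM, Q_SYN);
input CLK, RESET, D;
output Q_SIM, Q_SYN;
reg Q_SIM, Q_SYN;

// Simulation style: the starting value comes from an initial block and
// the assignment carries a #2 delay. Synthesis tools commonly ignore
// both, so the hardware may not match the simulation.
initial Q_SIM = 1'b0;

always @(posedge CLK)
  Q_SIM <= #2 D;

// Synthesis style: the starting value comes from an explicit reset and
// no delay is specified.
always @(posedge RESET or posedge CLK)
begin
  if (RESET)
    Q_SYN <= 1'b0;
  else
    Q_SYN <= D;
end
endmodule
```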

Compiling Large Designs 

Older versions of synthesis tools required incremental design
compilations to decrease run times. Some or all levels of hierarchy were
compiled with separate compile commands and saved as output or
database files. The output netlist or compiled database file for each 
module was read during synthesis of the top level code. This method 
is not necessary with new synthesis tools, which can handle large 
designs from the top down. The 5,000 gates per module rule of thumb 
no longer applies with the new synthesis tools. Refer to your 
synthesis tool documentation for details. 

Saving Compiled Design as XNF or EDIF 

After your design is successfully compiled, save it as an XNF or EDIF 
file for input to the Xilinx software. 

Setting Timing Constraints 

You can define timing specifications for your design in the User 
Constraints File (UCF). The UCF gives you tight control of the overall 
specifications by giving you access to more types of constraints, the 
ability to define precise timing paths, and the ability to prioritize 
signal constraints. Furthermore, you can group signals together to 
simplify timing specifications. Some synthesis tools translate certain 
synthesis constraints to Xilinx implementation constraints. The 
translated constraints are placed in a special TIMESPEC component. For 
more information on timing specifications in the UCF file, refer to the 
Quick Start Guide for Xilinx Alliance Series, the Libraries Guide, and the 
Answers Database on the Xilinx Web site (http://www.xilinx.com). 

Naming and Grouping Signals Together 

You can name and group signals with TNMs (Timing Names) or with 
TIMEGRPs (Time groups). TNMs and TIMEGRPs are placed on these 
start and end points: ports, registers, latches, or synchronous RAMs. 
The new specification, TPSYNC, allows you to define an asynchronous 
node for a timing specification. 

TNMs 

Timing Names are used to identify a port, register, latch, RAM, or 
groups of these components for timing specifications. TNMs are 
specified from a UCF with the following syntax. 

INST Instance_Name TNM=TNM_Name; 

Instance_Name is the name given to the port, register, latch, or RAM 
in your design. You provide the instance names for any port or 
instantiated component in your HDL code. Inferred flip-flop and latch 
names can usually be determined from the log files. TNM_Name is the 
arbitrary name you give the timing group. 

You can include several of these statements in the UCF file with a 
common TNM_Name to group elements for a timing specification 
as follows. 

NET DATA TNM=INPUT_PORTS; 

NET SELECT TNM=INPUT_PORTS; 

The above example takes two ports, DATA and SELECT, and gives 
them the common timing name INPUT_PORTS. 
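
When many ports share one timing name, the NET statements can be 
generated instead of typed by hand. The following Python sketch is an 
illustration only and is not part of any Xilinx tool. The helper name 
tnm_group is hypothetical; it simply formats text in the NET ... TNM= 
form shown above. 

```python
def tnm_group(ports, group_name):
    """Return one UCF 'NET ... TNM=' statement per port.

    Every statement uses the same group_name, which is how several
    elements end up in a single timing group.
    """
    return [f"NET {port} TNM={group_name};" for port in ports]


# Reproduces the two-port example from the text.
lines = tnm_group(["DATA", "SELECT"], "INPUT_PORTS")
print("\n".join(lines))
```

The output can be pasted into the UCF file. Instance and net names 
must still match the names in your netlist. 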

TIMEGRPs 

Time Groups are another method for specifying a group of 
components for timing specifications. 

Time groups use existing TNMs or TIMEGRPs to create new groups 
or to define new groups based on the output net that the group 
sources. There are several methods to create TIMEGRPs in the UCF 
file, as follows. 

TIMEGRP TIMEGRP_Name=TNM1:TNM2; 

TIMEGRP TIMEGRP_Name=TNM3:EXCEPT:TNM4; 

The Xilinx software recognizes the following global timing names. 

• FFS: All flip-flops in your design 

• PADS: All external ports in your design 

• RAMS: All synchronous RAMs in your design 

• LATCHES: All latches in your design 

The following time group specifies the group name, FAST_FFS, 
which consists of all flip-flops in your design except for the ones with 
the TNM or TIMEGRP SLOW_FFS attribute. 

TIMEGRP FAST_FFS=FFS:EXCEPT:SLOW_FFS; 

TPSYNC Specification 

In the latest version of the Xilinx software, you can define any node 
as a source or destination for a timing specification with the TPSYNC 
keyword. In synthesis designs, it is usually difficult to identify the net 
names for asynchronous paths of inferred logic. These net names can 
change from compile to compile, so it is not recommended to use this 
specification with inferred logic. However, with instantiated logic, 
the declared SIGNAL or WIRE name usually remains intact in the 
netlist and does not change from compile to compile. Some synthesis 
tools can preserve the signal/net name defined in the RTL through 
the optimization process. Check with your synthesis vendor for this 
capability. The UCF syntax is as follows. 

NET Net_Name TPSYNC=TPSYNC_Name; 

In the following NET statement, the TPSYNC is attached to the 
output net of a 3-state buffer, BUS3STATE. If a TPSYNC is attached to 
a net, then the source of the net is considered to be the endpoint (in 
this case, the 3-state buffer itself). The subsequent TIMESPEC 
statement can use the TPSYNC name just as it uses a TNM name. 

NET BUS3STATE TPSYNC=bus3; 

TIMESPEC TSNewSpc3=FROM:PAD(ENABLE_BUS):TO:bus3:20ns; 

Specifying Timing Constraints 

After your design signals are specified with TNMs, TIMEGRPs, or 
global timing names, you can place a specification on the design 
paths. There are a few methods for specifying these timing paths, and 
different specifications have different priorities. 


Note: Current versions of the Xilinx implementation tools have 
improved methods for entering timing constraints. Refer to the Xilinx 
documentation for your version of the place and route tools for the 
latest constraints commands and styles. 

Period Constraint 

The Period constraint specifies a clock period or clock speed on a net 
or clock port. The Xilinx tools attempt to meet all Pad to Setup 
requirements, as well as all Clock to Setup delays for registers 
clocked by the specified clock net. This is equivalent to a create_clock 
type of command in a synthesis tool script. Following are the two 
methods for specifying a period constraint. 

NET Clock_Name PERIOD = Clock_Period; 

or 

NET Clock_Name TNM=TNM_Name; 

TIMESPEC TIMESPEC_Name = PERIOD:TNM_Name:Clock_Period; 

The following example specifies that the CLOCK port has a period of 
50ns. All input paths to flip-flops clocked with this port are 
designated to operate at 50ns. 

NET CLOCK PERIOD = 50; 
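
Design requirements are often stated as a clock frequency rather than 
a period. The following Python sketch shows the conversion and formats 
the result in the PERIOD form used above. It is an illustration only; 
the function names are hypothetical, and the value is assumed to be in 
nanoseconds, as in the example above. 

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / freq_mhz


def period_constraint(clock_net, freq_mhz):
    """Format a 'NET ... PERIOD =' statement for the given clock net."""
    return f"NET {clock_net} PERIOD = {period_ns(freq_mhz):g};"


# A 20 MHz clock corresponds to the 50 ns period used in the text.
print(period_constraint("CLOCK", 20))
```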

FROM:TO Style Constraint 

Specific paths can be specified with a FROM:TO style timing 
specification. These constraints are specified using global timing 
names, TNMs, TIMEGRPs, or TPSYNCs to connect the source and 
destination of the timing path, as well as the desired maximum delay 
of the path. An equivalent synthesis tool command is a set_max_delay 
type of command. A UCF example follows. 

TIMESPEC TIMESPEC_Name = 
FROM:Source_Name:TO:Destination_Name:Delay_Value; 

TIMESPEC_Name is specified with the TS identifier followed by a 
number, such as TS01. 

The following example specifies a new timespec with the identifier 
TS01 so that all paths that are sourced by a port and end at a register 
grouped with the name DATA_FLOPS have a delay less than 30ns. 

TIMESPEC TS01 = FROM:PADS:TO:DATA_FLOPS:30; 

Offset Constraint 

Note: The OFFSET constraint must be used with the clock PERIOD 
constraint. 

The OFFSET constraint can be applied to ports defined in your code. 
It defines the delay of a signal relative to a clock, and is only valid for 
registered data paths. The OFFSET constraint specifies the signal 
delay external to the chip, allowing the implementation tools to 
automatically adjust relevant internal delays (CLK buffer and distribution 
delays) to accommodate the external delay specified with this 
constraint. This constraint is equivalent to the set_input_delay and 
set_output_delay type of commands in your synthesis tool. 

NET Port_Name OFFSET = {IN | OUT} Time {BEFORE | AFTER} Clock_Name; 

IN | OUT specifies that the offset is calculated with respect to an 
input IOB or an output IOB. 

For a bidirectional IOB, the IN | OUT syntax lets you specify the flow 
of data (input or output) on the IOB. BEFORE | AFTER indicates 
whether data is to arrive (input) or leave (output) the device before or 
after the clock input. 

The following example specifies that the data on the output port, 
DATA_OUT, arrives on the output pin 20ns after the edge of the clock 
signal, CLOCK, arrives. 

NET DATA_OUT OFFSET = OUT 20 AFTER CLOCK; 
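
Because OFFSET is used together with PERIOD, the two values show how 
much of the clock cycle remains outside the FPGA. The following Python 
sketch is an illustration under stated assumptions: the receiving 
device is clocked by the same CLOCK signal, board-level clock skew is 
negligible, and the 50ns period comes from the PERIOD example earlier 
in this section. The function name is hypothetical. 

```python
def external_budget_ns(clock_period_ns, offset_out_ns):
    """Time left in one clock cycle after data is valid at the output pin.

    This remainder must cover the board trace delay and the setup time
    of the receiving device.
    """
    if offset_out_ns > clock_period_ns:
        raise ValueError("OFFSET OUT value exceeds the clock period")
    return clock_period_ns - offset_out_ns


# PERIOD = 50 and OFFSET = OUT 20 AFTER CLOCK, as in the examples.
print(external_budget_ns(50, 20), "ns remain for the external path")
```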

Ignoring Timing Paths 

When a timespec is issued for a path that is not timing-critical, you 
can specify to ignore this path for one or all timing specifications. A 
TIG (Timing IGnore) can be specified on these particular nets. The 
synthesis tool equivalent is the set_false_path command. The UCF 
syntax is as follows. 

NET Signal_Name TIG=TIMESPEC_Name; 

To ignore all timing constraints for a signal: 

NET Signal_Name TIG; 

To ignore an entire timing constraint: 

TIG=TIMESPEC_Name; 

In the following example, the SLOW_PATH net is set to ignore the 
timing constraint with the name TS01. 

NET SLOW_PATH TIG=TS01; 

Controlling Signal Skew 

You can control the maximum allowed skew in your designs. The 
maximum skew (MAXSKEW) is the difference between the longest 
and shortest driver-to-load connection delays for a given net. The 
maximum and minimum delays are determined using worst case 
maximum delay values for each path. While this specification cannot 
guarantee that this maximum skew value is achieved in the actual 
device, it allows the software to minimize the amount of skew on the 
specified signal. This specification is useful for high-fanout nets when 
all available global buffers have been used for other critical signals. 
An example of the UCF syntax for this specification follows. 

NET Signal_Name MAXSKEW=Skew_Value; 

The following example specifies that the CLOCK_ENABLE signal 
should not have a skew value greater than 4ns. 

NET CLOCK_ENABLE MAXSKEW=4; 
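
The skew definition above is a simple difference, which the following 
Python sketch makes explicit. It is an illustration only. The delay 
values are hypothetical and would normally come from a timing report; 
the function names are not part of any Xilinx tool. 

```python
def net_skew(load_delays_ns):
    """Skew of a net: longest minus shortest driver-to-load delay."""
    if not load_delays_ns:
        raise ValueError("at least one load delay is required")
    return max(load_delays_ns) - min(load_delays_ns)


def meets_maxskew(load_delays_ns, maxskew_ns):
    """True when the net's skew does not exceed the MAXSKEW value."""
    return net_skew(load_delays_ns) <= maxskew_ns


# Hypothetical driver-to-load delays (ns) for the CLOCK_ENABLE net.
delays = [2.5, 4.0, 6.0]
print("skew:", net_skew(delays), "ns")
print("meets MAXSKEW=4:", meets_maxskew(delays, 4))
```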

Timing Constraint Priority 

Timing constraints can be assigned priorities when paths are 
overlapped by multiple timing constraints. Priorities can be directly 
specified to a timing constraint as follows. 

TIMESPEC TIMESPEC_Name = FROM Group1 TO Group2 
Delay_Value PRIORITY Priority_Level; 

The lower the priority level, the higher the precedence. 

The following example sets a timespec where the source is a time 
group labeled THESE_FFS and the destination is labeled 
THOSE_FFS, with a delay value of 25ns and a priority level of 2. 

TIMESPEC TS04=FROM:THESE_FFS:TO:THOSE_FFS:25 
PRIORITY 2; 

However, timing constraints have an inherent precedence that is 
based on the type of constraint and the site description provided to 
the tools. If two constraints are of the same priority and cover the 
same path, then the last constraint in the constraint file overrides any 
other constraints that overlap. 
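
The two rules above (a lower priority level wins, and for equal 
priority the later constraint in the file wins) can be sketched as 
follows. This Python sketch is an illustration only. It covers 
constraints of the same type and does not model the inherent precedence 
between constraint types. The class and function names are 
hypothetical, and TS05 and TS06 are invented for the example. 

```python
from dataclasses import dataclass


@dataclass
class Timespec:
    name: str
    delay_ns: float
    priority: int     # lower number means higher precedence
    file_order: int   # position of the statement in the constraint file


def winning_timespec(overlapping):
    """Pick the constraint that applies to a path covered by several.

    Lowest priority number first; among equal priorities, the statement
    that appears last in the file overrides the others.
    """
    return min(overlapping, key=lambda ts: (ts.priority, -ts.file_order))


specs = [
    Timespec("TS04", 25, priority=2, file_order=0),
    Timespec("TS05", 20, priority=3, file_order=1),  # hypothetical
    Timespec("TS06", 30, priority=2, file_order=2),  # hypothetical
]
print(winning_timespec(specs).name)
```

Here TS04 and TS06 share priority level 2, so the later statement, 
TS06, is the one selected. 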

Inherent timing constraint priority is shown in the following table. 

Note: You cannot assign a priority to override inherent timing 
constraint priority. You can set priorities for different timing within 
the same constraint type. 


Table 3-1 Precedence of Constraints 

Across Constraint Sources 

Highest Priority    Physical Constraint File (PCF) 
                    User Constraint File (UCF) 
Lowest Priority     Input Netlist / Netlist Constraint File (NCF) 

Within Constraint Sources 

Highest Priority    TIG (Timing Ignore) 
                    FROM:USER1:THRU:USER_T:TO:USER2 Specification 
                      (USER1 and USER2 are user-defined groups) 
                    FROM:USER1:THRU:USER_T:TO:FFS Specification or 
                      FROM:FFS:THRU:USER_T:TO:USER2 Specification 
                      (FFS is any pre-defined group) 
                    FROM:FFS:THRU:USER_T:TO:FFS Specification 
                    FROM:USER1:TO:USER2 Specification 
                    FROM:USER1:TO:FFS Specification or 
                      FROM:FFS:TO:USER2 Specification 
                    FROM:FFS:TO:FFS Specification 
                    Period specification 
Lowest Priority     "Allpaths" type constraints 



Evaluating Design Size and Performance 

Your design should meet the following requirements. 

• Design must function at the specified speed 

• Design must fit in the targeted device 

After your design is compiled, you can determine preliminary device 
utilization and performance with your synthesis tool's reporting 
options. After your design is mapped by the Xilinx tools, you can 
determine the actual device utilization. At this point in the design 
flow, you should verify that your chosen device is large enough to 
incorporate any future changes or additions, and that your design 
will perform as specified. 

Using your Synthesis Tool to Estimate Device 
Utilization and Performance 

Use your synthesis tool's area and timing reporting options to estimate 
device utilization and performance. After compiling, use the 
report area command to obtain a report of device resource utilization. 
Some synthesis tools provide area reports automatically. Refer to 
your synthesis tool documentation for correct command syntax. 

Note: See the "Report Files" appendix for sample report files from 
various synthesis vendors. 

This report lists the compiled cells in your design, as well as 
information on how your design is mapped in the FPGA. These reports are 
generally accurate for the XC4000 and Spartan families because the 
synthesis tool creates the logic from your code and maps your design 
into the FPGA. However, these reports are different for the various 
synthesis tools. Some reports specify the minimum number of CLBs 
required, while other reports specify the "unpacked" number of 
CLBs to make an allowance for routing. For an accurate comparison, 
you should compare reports from the Xilinx place and route tool after 
implementation. Also, any instantiated components, such as LogiBLOX 
modules, EDIF files, XNF files, or other components that your 
synthesis tool does not recognize during compilation are not 
included in the report file. If you include these components in your 
design, you must include the logic area used by these components 
when estimating design size. Also, sections of your design may get 
trimmed during the mapping process, and may result in a smaller 
design. 

Using the Timing Report Command 

Use your synthesis tool's timing report command to obtain a report 
with estimated data path delays. Refer to your synthesis vendor's 
documentation for command syntax. 

Note: See the "Report Files" appendix for sample report files from 
various synthesis vendors. 

This report is based on the logic level delays from the cell libraries 
and estimated wire-load models for your design. This report is an 
estimate of how close you are to your timing goals; however, it is not 
the actual timing for your design. An accurate report of your design's 
timing is only available after your design is placed and routed. This 
timing report does not include information on any instantiated 
components, such as LogiBLOX modules, EDIF files, XNF files, or 
other components that are not recognized by your synthesis tool 
during compilation. 

Determining Actual Device Utilization and Pre-routed 
Performance 

To determine if your design fits the specified device, you must map it 
with the Xilinx Map program. The generated report file 
design_name.mrp contains the implemented device utilization 
information. You can run the Map program from the Design Manager or 
from the command line. 

Using the Design Manager to Map Your Design 

Use the following steps to map your design using the Design 
Manager. 

Note: For more information on using the Design Manager, see the 
Design Manager/Flow Engine Reference/User Guide. 

1. To start the Design Manager, enter the following command. 

xilinx 

2. To create a new project, select the XNF or EDIF file generated by 
your synthesis tool as your input file from the File -> New 
Project menu command. 

3. To start design implementation, click the Implement toolbar 
button or select Design -> Implement. 

The Implement dialog box appears. 

4. If necessary, select a part in the dialog box. 

5. Select the Options button in the Implement dialog box. 

The Options dialog box appears. 

6. Select the Produce Logic Level Timing Report option. 

This option creates a timing report prior to place and route, but 
after map, as described in the following five steps. 

7. Select the Edit Template button next to the Implementation 
drop-down list. 

The Implementation Template dialog box appears. 

8. Select the Timing tab. 

9. Select the Produce Logic Level Timing Report radio button. 

10. Select the type of report you want to create. 

The default is Report Paths in Timing Constraints. 

11. Use the Implementation Template dialog box tabs (Optimize & 
Map, Place & Route, or Interface) to select any other options 
applicable to your design. Select OK to exit the Implementation 
Template dialog box. 

Note: Xilinx recommends using the default Map options for your 
designs. Also, do not use the guided map option with your 
synthesized designs. 

12. Select Run in the Implement dialog box to begin implementing 
your design. 

13. When the Flow Engine is displayed, stop the processing of your 
design after mapping by selecting Setup -> Stop After or by 
selecting the Set Target toolbar button. 

The Stop After dialog box appears. 

14. Select Map and select OK. 

15. After the Flow Engine is finished mapping your design, select 
Utilities -> Report Browser to view the map report. 
Double-click the report icon that you want to view. The map 
report includes a Design Summary section that contains the 
device utilization information. 

16. View the Logic Level Timing Report with the Report Browser. 
This report shows the performance of your design based on logic 
levels and best-case routing delays. 

17. At this point, you may want to start the Timing Analyzer from 
the Design Manager to create a more specific report of design 
paths. 

18. Use the Logic Level Timing Report and any reports generated 
with the Timing Analyzer or the Map program to evaluate how 
close you are to your performance and utilization goals. Use 
these reports to decide whether to proceed to the place and route 
phase of implementation, or to go back and modify your design 
or implementation options to attain your performance goals. You 
should have some slack in routing delays to allow the place and 
route tools to successfully complete your design. Use the verbose 
option in the Timing Analyzer to see block-by-block delay. The 
timing report of a mapped design (before place and route) shows 
block delays, as well as estimated routing delays. 

Using the Command Line to Map Your Design 

1. Translate your design as follows. 

ngdbuild -p target_device design_name.xnf 

2. Map your design as follows. 

map design_name.ngd 

3. Use a text editor to view the Design Summary section of the 
design_name.mrp map report. This section contains the device 
utilization information. 

4. Run a timing analysis of the logic level delays from your mapped 
design as follows. 

trce [options] design_name.ncd 

Note: For available options, enter only the trce command at the 
command line without any arguments. 

Use the Trace reports to evaluate how close you are to your 
performance goals. Use the report to decide whether to proceed 
to the place and route phase of implementation, or to go back and 
modify your design or implementation options to attain your 
performance goals. You should have some slack in routing delays 
to allow the place and route tools to successfully complete your 
design. 
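
When you compare utilization across several runs, the "out of" lines 
in the map report can be read by a short script instead of a text 
editor. The following Python sketch is an illustration only and is not 
a Xilinx utility. It assumes report lines of the form "Number of CLBs: 
39 out of 100", as in the sample Map report in this section; the exact 
spacing in a real .mrp file may differ, and the function name is 
hypothetical. 

```python
import re

# Matches lines such as "Number of CLBs:   39 out of   100".
# The layout is assumed from the sample Map report in this section.
USAGE_LINE = re.compile(
    r"Number of (?P<resource>[^:\n]+):[ \t]+(?P<used>\d+)"
    r"[ \t]+out[ \t]+of[ \t]+(?P<total>\d+)"
)


def utilization(report_text):
    """Return {resource: (used, available, percent)} for each 'out of' line."""
    result = {}
    for match in USAGE_LINE.finditer(report_text):
        used = int(match.group("used"))
        total = int(match.group("total"))
        if total == 0:
            continue  # nothing available, so a percentage is meaningless
        result[match.group("resource")] = (used, total, round(100 * used / total))
    return result


sample = """\
Number of errors:            0
Number of CLBs:             39 out of   100
Number of bonded IOBs:      30 out of    61
Number of secondary CLKs:    1 out of     4
"""

for resource, (used, total, percent) in utilization(sample).items():
    print(f"{resource}: {used} of {total} ({percent}%)")
```

Lines without an "out of" count, such as the error and warning totals, 
are skipped. 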

The following is the Design Summary section of a Map report. 

Design Summary 

   Number of errors:             0 
   Number of warnings:           3 
   Number of CLBs:              39 out of   100    39% 
      CLB Flip Flops:      32 
      4 input LUTs:        66 
      3 input LUTs:         5 
   Number of bonded IOBs:       30 out of    61    49% 
      IOB Flops:            0 
      IOB Latches:          0 
   Number of secondary CLKs:     1 out of     4    25% 
   Number of oscillators:        1 
   Number of STARTUPs:           1 
   Number of READCLKs:           1 
   Number of READBACKs:          1 
   Number of MD0 pads:           1 
   Number of MD1 pads:           1 

Total equivalent gate count for design: 1538 
Additional JTAG gate count for IOBs:    1536 

The following is a sample Logic Level Timing Report. 


Xilinx TRACE, Version M1.4.12 

Copyright (c) 1995-1997 Xilinx, Inc. All rights reserved. 

Design file: map.ncd 

Physical constraint file: demo_board.pcf 

Device,speed: xc4003e,-2 (x1_0.06 PRELIMINARY) 

Report level: summary report 

====================================================================== 

Timing constraint: NET "FAST_CLOCK" PERIOD = 15.200 nS HIGH 50.000 % ; 

1 item analyzed, 0 timing errors detected. 

Minimum period is 5.585ns. 

====================================================================== 

Timing constraint: NET "control_logic/SLOW_CLOCK" PERIOD = 121.600 nS 
HIGH 50.000 % ; 

677 items analyzed, 0 timing errors detected. 

Minimum period is 17.295ns. 

====================================================================== 

All constraints were met. 

Timing summary: 

Timing errors: 0 Score: 0 

Constraints cover 811 paths, 0 nets, and 232 connections (73.2% coverage) 

Design statistics: 

Minimum period: 17.295ns (Maximum frequency: 57.820MHz) 

Analysis completed Tue Jan 27 12:07:59 1998 
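
The maximum frequency in the report is the reciprocal of the minimum 
period, and the margin against a PERIOD constraint is the difference 
between the constrained and achieved periods. The following Python 
sketch reproduces both calculations with the 17.295ns minimum period 
and the 121.600ns constraint from the sample report. It is an 
illustration only; the function names are hypothetical. 

```python
def max_frequency_mhz(min_period_ns):
    """Highest clock frequency (MHz) supported by a minimum period in ns."""
    if min_period_ns <= 0:
        raise ValueError("minimum period must be positive")
    return 1000.0 / min_period_ns


def period_slack_ns(constraint_ns, min_period_ns):
    """Positive slack means the PERIOD constraint is met with margin."""
    return constraint_ns - min_period_ns


# Values taken from the sample Logic Level Timing Report.
fmax = max_frequency_mhz(17.295)
slack = period_slack_ns(121.600, 17.295)
print(f"Maximum frequency: {fmax:.3f} MHz")
print(f"Slack against the 121.600 ns constraint: {slack:.3f} ns")
```

The result is about 57.82 MHz. Because this report is produced before 
place and route, some of the slack is consumed later by actual routing 
delays. 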


Evaluating your Design for Coding Style and 
System Features 

At this point, if you are not satisfied with your design performance, 
you can re-evaluate your code and make any necessary 
improvements. Modifying your code and selecting different compiler options 
can dramatically improve device utilization and speed. 



Tips for Improving Design Performance 

This section includes ways of improving design performance by 
modifying your code and by incorporating FPGA system features. 
Most of these techniques are described in more detail in this manual. 

Modifying Your Code 

You can improve design performance with the following design 
modifications. 

• Reduce levels of logic to improve timing 

• Redefine hierarchical boundaries to help the compiler optimize 
design logic 

• Pipeline 

• Logic replication 

• Use of LogiBLOX or Coregen modules 

• Resource sharing 

• Restructure logic 

Using FPGA System Features 

After correcting any coding style problems, use any of the following 
FPGA system features in your design to improve resource utilization 
and to enhance the speed of critical paths. 

Note: Each device family has a unique set of system features. Review 
the current version of The Programmable Logic Data Book for the 
system features available for the device you are targeting. 

• Use global set/reset and global tri-state nets to reduce routing 
congestion and improve design performance 

• Use clock enables 

• Place the highest fanout signals on the global buffers 

• Modify large multiplexers to use tri-state buffers 

• Use one-hot encoding for large or complex state machines 

• Use I/O registers when applicable 

• Use I/O decoders when applicable 

• Use I/O multiplexers when applicable 

Using Xilinx-specific Features of Your Synthesis Tool 

Most synthesis tools have special options for the Xilinx-specific 
features listed in the previous section. Refer to your synthesis tool 
documentation for help on using Xilinx-specific features. 

Placing and Routing Your Design 

Note: For more information on placing and routing your design, refer 
to the Development System Reference Guide. 

The overall goal when placing and routing your design is fast 
implementation and high-quality results. However, depending on the 
situation and your design, you may not always accomplish this goal, as 
described in the following examples. 

• Earlier in the design cycle, run time is generally more important 
than the quality of results, and later in the design cycle, the 
converse is usually true. 

• During the day, you may want the tools to quickly process your 
design while you are waiting for the results. However, you may 
be less concerned with a quick run time, and more concerned 
about the quality of results when you run your designs for an 
extended period of time (during the night or weekend). 

• If the targeted device is highly utilized, the routing may become 
congested, and your design may be difficult to route. In this case, 
the placer and router may take longer to meet your timing 
requirements. 

• If design constraints are rigorous, it may take longer to correctly 
place and route your design, and meet the specified timing. 

Decreasing Implementation Time 

The options you select for the placement and routing of your design 
directly influence the run time. Generally, these options decrease the 
run time at the expense of the best placement and routing for a given 
device. Select your options based on your required design 
performance. 

Note: If you are using the command line, the appropriate command 
line option is provided in the following procedure. 

Use the following steps to decrease implementation time in the 
Design Manager. 

1. Select Design -> Implement. 

The Implement dialog box appears. 

2. Select the Options button in the Implement dialog box. 

The Options dialog box appears. 

3. Select the Edit Template button next to the Implementation 
drop-down list in the Program Options Templates field. The 
Implementation Template dialog box appears. 

4. Select the Place & Route tab. 

5. Set options in this dialog box as follows. 

• Place & Route Effort Level 

Generally, you can reduce placement times by selecting a less 
CPU-intensive algorithm for placement. You can set the 
placement level from 1 (fastest run time) to 5 (best results) 
with the default equal to 2. Use the -l switch at the command 
line to perform the same function. 

Note: In some cases, poor placement with a lower placement level 
setting can result in longer route times. 

• Router Options 

You can limit router iterations to reduce routing times. 
However, this may prevent your design from meeting timing 
requirements, or your design may not completely route. 
From the command line, you can control router passes with 
the -i switch. 

• Use Timing Constraints During Place and Route 

You can improve run times by not specifying some or all 
timing constraints. This is useful at the beginning of the 
design cycle during the initial evaluation of the placed and 
routed circuit. To disable timing constraints in the Design 
Manager, deselect the Use Timing Constraints During Place 
and Route button. To disable timing constraints at the 
command line, use the -x switch with PAR. 

6. Select OK to exit the Implementation Template dialog box. 

7. Select any applicable options in the Options dialog box. 

8. Select OK. 

9. Select Run in the Implement dialog box to begin implementing 
your design. 
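
If you run PAR from a script, the three run-time settings described in 
step 5 can be collected in one place. The following Python sketch is an 
illustration only. It assembles the -l, -i, and -x switches named above 
and does not run PAR. The function name is hypothetical, and the 
argument forms (a level after -l, a pass count after -i) are 
assumptions; check the Development System Reference Guide for your 
version of PAR. 

```python
def fast_par_switches(effort_level=2, router_passes=None,
                      use_timing_constraints=True):
    """Build the PAR switches that trade result quality for run time.

    effort_level: 1 (fastest run time) to 5 (best results); default 2.
    router_passes: optional limit on router iterations (-i).
    use_timing_constraints: False adds -x to ignore timing constraints.
    """
    if not 1 <= effort_level <= 5:
        raise ValueError("effort level must be between 1 and 5")
    switches = ["-l", str(effort_level)]
    if router_passes is not None:
        switches += ["-i", str(router_passes)]
    if not use_timing_constraints:
        switches.append("-x")
    return switches


# Quick first-pass run: lowest effort, three router passes, no timing.
print(" ".join(fast_par_switches(1, router_passes=3,
                                 use_timing_constraints=False)))
```

Settings like these suit early evaluation runs. Restore the timing 
constraints and a higher effort level before judging final 
performance. 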

Improving Implementation Results 

Conversely, you can select options that increase the run time, but 
produce a better design. These options generally produce a faster 
design at the cost of a longer run time. These options are useful when 
you run your designs for an extended period of time (overnight or 
over the weekend). 

Multi-Pass Place and Route Option 

Use this option to place and route your design with several different 
cost tables (seeds) to find the best possible placement for your design. 
This optimal placement results in shorter routing delays and faster 
designs. This option works well when the router passes are limited 
(with the -i option). After an optimal cost table is selected, use the 
re-entrant routing feature to finish the routing of your design. You may 
select this option from the Design menu in the Design Manager, or 
specify this option at the command line with the -n switch. 

Turns Engine Option (UNIX only) 

This option is a Unix-only feature that works with the Multi-Pass 
Place and Route option to allow parallel processing of placement and 
routing on several Unix machines. The only limitation to how many 
cost tables are concurrently tested is the number of workstations you 
have available. To use this option in the Design Manager, specify a 
node list when selecting the Multi-Pass Place and Route option. To 
use this feature at the command line, use the -m switch to specify a 
node list, and the -n switch to specify the number of place and route 
iterations. 

Note: For more information on the turns engine option, refer to the 
Xilinx Development System Reference Guide. 

Re-entrant Routing Option 

Use the re-entrant routing option to further route an already routed 
design. The router reroutes some connections to improve the timing 
or to finish routing unrouted nets. You must specify a placed and 
routed design (.ncd) file for the implementation tools. This option is 
best used when router iterations are initially limited, or when your 
design timing goals are close to being achieved. 

From the Design Manager 

To initiate a re-entrant route from the Design Manager interface, 
follow these steps. 

1. From the Design Manager, select the placed and routed design 
revision for the re-entrant option. 

2. Select Tools -> Flow Engine to start the Flow Engine from the 
Design Manager. 

3. From the Flow Engine menu, select Setup -> Re-entrant 
Route. 

4. In the Advanced dialog box that is displayed, select the Allow 
Re-entrant Routing option. 

5. Select the appropriate options in the Re-entrant Route Options 
field. 

6. Select OK. 

7. The Place and Route icon in the Flow Engine is replaced with the 
Re-entrant Route icon. If this step is completed, use the Step Back 
button until the Re-entrant Route icon no longer indicates 
completed. 

8. Select Run to complete the re-entrant routing. 

From the Command Line 

To initiate a re-entrant route from the command line, you can run 
PAR with the -k and -p options, as well as any other options you 
want to use for the routing process. You must either specify a unique 
name for the post re-entrant routed design (.ncd) file or use the -w 
switch to overwrite the previous design file, as shown in the 
following examples. 

par -k -p other_options design_name.ncd new_name.ncd 

par -k -p -w other_options design_name.ncd design_name.ncd 

Cost-Based Clean-up Option 

This option specifies clean-up passes after routing is completed to 
substitute more appropriate routing options available from the initial 
routing process. For example, if several local routing resources are 
used to traverse the chip and a longline is available, the longline is 
substituted in the clean-up pass. The default value of cost-based 
cleanup passes is 1. To change the default value, use the Template 
Manager in the Design Manager, or the -c switch at the command 
line. 

Delay-Based Clean-up Option 

This option specifies clean-up passes after routing is completed to 
substitute more appropriate routing options to reduce delays. The 
default number of passes for delay-based clean-up is 0. You can 
change the default in the Design Manager in the Implementation 
Options window, or at the command line with the -d switch. 

Guide Option (not recommended) 

This option is generally not recommended for synthesis-based 
designs. Re-synthesizing modules can cause the signal and instance 
names in the resulting netlist to be significantly different from those 
in earlier synthesis runs. This can occur even if the source level code 
(Verilog or VHDL) contains only a small change. Because the guide 
process is dependent on the names of signals and components, synthesis 
designs often result in a low match rate during the guiding process. 
Generally, this option does not improve implementation results. 

Timing Simulation of Your Design 

Note: Refer to the "Simulating Your Design" chapter for more
information on design simulation.

Timing simulation is important in verifying the operation of your
circuit after the worst-case placed and routed delays are calculated
for your design. In many cases, you can use the same test bench that
you used for functional simulation to perform a more accurate
simulation with less effort. You can compare the results from the two
simulations to verify that your design is performing as initially
specified. The Xilinx tools create a VHDL or Verilog simulation netlist
of your placed and routed design, and provide libraries that work with
many common HDL simulators.

Downloading to the Device and In-system 
Debugging 

After you have verified the functionality and timing of your placed 
and routed design, you can create a design data file to download for 
in-system verification. The design data or bitstream (.bit) file is 
created from the placed and routed .ncd file. In the Design Manager, 
use the Configuration step in the Flow Engine to create this file. From
the command line, run BitGen on your placed and routed .ncd file to
create the .bit file as follows. 

bitgen [options] design.ncd

Use the .bit file with the XChecker cable and the Hardware Debugger 
to download the data to your device. You can run the Hardware 
Debugger from the Design Manager, or from the command line as 
follows. 

hwdebugr design.bit

The Hardware Debugger allows you to download the data to the
FPGA using your computer's serial port. The Hardware Debugger 
can also synchronously or asynchronously probe external or internal 
nodes in the FPGA. Waveforms can be created from this data and 
correlated to the simulation data for true in-system verification of 
your design. 

Creating a PROM File for Stand-Alone Operation 

After verifying that the FPGA works in the circuit, you can create a
PROM file from the .bit file to program a PROM or other data storage
device. You can then use this file to program the FPGA in-circuit
during normal operation.

Use the Prom File Formatter to create the PROM file, or from the
command line use PROMGen. You can run the Prom File Formatter
from the Design Manager, or from the command line as follows.

promfmtr design.bit

Run PROMGen from the command line by typing the following.

promgen [options] design.bit

Note: For more information on using these programs, refer to the
Xilinx Development System Reference Guide.

Designing FPGAs with HDL


This chapter includes coding techniques to help you improve
synthesis results. It includes the following sections.

• "Introduction"

• "Using Global Low-skew Clock Buffers"

• "Using Dedicated Global Set/Reset Resource"

• "Encoding State Machines"

• "Using Dedicated I/O Decoders"

• "Instantiating LogiBLOX Modules"

• "Implementing Memory"

• "Implementing Boundary Scan (JTAG 1149.1)"

• "Implementing Logic with IOBs"

• "Implementing Multiplexers with Tristate Buffers"

• "Using Pipelining"

• "Design Hierarchy"

Introduction 

Xilinx FPGAs provide the benefits of custom CMOS VLSI and allow 
you to avoid the initial cost, time delay, and risk of conventional 
masked gate array devices. In addition to the logic in the CLBs and 
IOBs, the XC4000 family, XC5200 family, and Spartan family FPGAs 
contain system-oriented features such as the following. 

• Global low-skew clock or signal distribution network 

• Wide edge decoders (XC4000 family only)

• On-chip RAM and ROM (XC4000 family and Spartan)

• IEEE 1149.1-compatible boundary scan logic support

• Flexible I/O with Adjustable Slew-rate Control and Pull-up/ 
Pull-down Resistors 

• 12-mA sink current per output and 24-mA sink per output pair 

• Dedicated high-speed carry-propagation circuit 

You can use these device characteristics to improve resource
utilization and enhance the speed of critical paths in your HDL designs.
The examples in this chapter are provided to help you incorporate these
system features into your HDL designs.

Using Global Low-skew Clock Buffers 

For designs with global signals, use global clock buffers to take 
advantage of the low-skew, high-drive capabilities of the dedicated 
global buffer tree of the target device. When you use the Insert Pads 
or equivalent command, your synthesis tool automatically inserts a 
BUFG generic clock buffer whenever an input signal drives a clock 
signal. The Xilinx implementation software automatically selects the 
clock buffer that is appropriate for your specified design architecture. 
If you want to use a specific global buffer, you must instantiate it. 
Many synthesis tools automatically insert I/O pins and clock buffers. 
Also, some synthesis tools limit I/O and global buffers. Refer to your 
synthesis tool documentation for detailed information. 

You can instantiate an architecture-specific buffer if you understand
the architecture and want to specify how the resources should be
used. Each XC4000E/L and Spartan device contains four primary and
four secondary global buffers that share the same routing resources.
XC4000EX/XLA/XL/XV devices have sixteen global buffers; each
buffer has its own routing resources. XC5200 devices have four
dedicated global buffers in each corner of the device.

XC4000EX/XLA/XL/XV devices have two different types of global
buffer: Global Low-Skew Buffers (BUFGLS) and Global Early Buffers
(BUFGE). Global Low-Skew Buffers are standard global buffers that
should be used for most internal clocking or high fanout signals that
must drive a large portion of the device. There are eight BUFGLS
buffers available, two in each corner of the device. The Global Early
Buffers are designed to provide faster clock access, but CLB access is
limited to one quadrant of the device. I/O access is also limited.
Similarly, there are eight BUFGEs, two in each corner of the device.

Because Global Early and Global Low-Skew Buffers share a single
pad, a single IPAD can drive a BUFGE, BUFGLS, or both in parallel.
The parallel configuration is especially useful for clocking the fast
capture latches of the device. Since the Global Early and Global
Low-Skew Buffers share a common input, they cannot be driven by two
different signals.

You can use the following criteria to help select the appropriate 
global buffer for a given design path. 

• The simplest option is to use a Global Low-Skew Buffer.

• If you want a faster clock path, use a BUFG. Initially, the software
will try to use a Global Low-Skew Buffer. If timing requirements
are not met, a BUFGE is automatically used if possible.

• If a single quadrant of the chip is sufficient for the clocked logic,
and timing requires a faster clock than the Global Low-Skew
Buffer, use a Global Early Buffer.

Note: For more information on using the XC4000EX/XLA/XL/XV
device family global buffers, refer to the online version of The
Programmable Logic Data Book or the Xilinx web site at
http://www.xilinx.com.

For XC4000E/L and Spartan devices, you can use secondary global
buffers (BUFGS) to buffer high-fanout, low-skew signals that are
sourced from inside the FPGA. To access the secondary global clock
buffer for an internal signal, instantiate the BUFGS cell. You can use
primary global buffers (BUFGP) to distribute signals applied to the
FPGA from an external source. Internal signals can be globally
distributed with a primary global buffer; however, the signals must
be driven by an external pin.

Some synthesis tools limit I/O or BUFG resources. For example, 
BUFG does not synthesize to more than eight instances depending on 
the selected device architecture. However, some tools do not use all 
your available resources. Compiling modules separately may also 
result in resource over-utilization. Check with your synthesis vendor. 

XC4000E/L and Spartan devices have four primary (BUFGP) and
four secondary (BUFGS) global clock buffers that share four global
routing lines, as shown in the following figure.

Figure 4-1 Global Buffer Routing Resources (XC4000E, Spartan)

These global routing resources are only available for the eight global 
buffers. The eight global nets run horizontally across the middle of 
the device and can be connected to one of the four vertical longlines 
that distribute signals to the CLBs in a column. Because of this 
arrangement only four of the eight global signals are available to the 
CLBs in a column. These routing resources are "free" resources 
because they are outside of the normal routing channels. Use these 
resources whenever possible. You may want to use the secondary 
buffers first because they have more flexible routing capabilities. 

You should use the global buffer routing resources primarily for
high-fanout clocks that require low skew; however, you can use them to
drive certain CLB pins, as shown in the following figure. In addition,
you can use these routing resources to drive high-fanout clock
enables, clear lines, and the clock pins (K) of CLBs and IOBs.

In the following figure, the C pins drive the input to the H function
generator, Direct Data-in, Preset, Clear, or Clock Enable pins. The F
and G pins are the inputs to the F and G function generators,
respectively.

Figure 4-2 Global Longlines Resource CLB Connections

If your design does not contain four high-fanout clocks, use these 
routing resources for signals with the next highest fanout. To reduce 
routing congestion, use the global buffers to route high-fanout 
signals. These high-fanout signals include clock enables and reset 
signals (not global reset signals). Use global buffer routing resources
to reduce routing congestion, enable routing of an otherwise
unroutable design, and ensure that routing resources are available for
critical nets.

Xilinx recommends that you assign up to four secondary global clock 
buffers to the four signals in your design with the highest fanout 
(such as clock nets, clock enables, and reset signals). Clock signals 
that require low skew have priority over low-fanout non-clock 
signals. You can source the signals with an input buffer or a gate 
internal to the design. Generate internally sourced clock signals with 
a register to avoid unwanted glitches. The synthesis tool can insert 
global clock buffers or you can instantiate them in your HDL code. 
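
As a minimal sketch of the recommendation to generate internal clocks
with a register, the following Verilog divides an incoming clock by two
in a flip-flop and distributes the registered output through a
secondary global buffer. The module, instance, and signal names
(clk_div2, U_GBUF, CLK_IN, CLK_DIV, CLK_DIV_GBUF) are hypothetical, and
the divide-by-two function is only an assumed example of an internally
sourced clock. BUFGS and its I and O pins are the library cell used in
the examples later in this section.

```verilog
// Hypothetical sketch: a registered internal clock on a BUFGS.
// Requires the Xilinx library model of BUFGS to simulate.
module clk_div2 (CLK_IN, DATA, DOUT);

input  CLK_IN, DATA;
output DOUT;

reg  CLK_DIV;        // internally generated clock (flip-flop output)
wire CLK_DIV_GBUF;   // same clock after the global buffer
reg  DOUT;

// Simulation-only start value; without it CLK_DIV stays unknown (X)
// in an RTL simulation. Synthesis tools may ignore this statement.
initial CLK_DIV = 1'b0;

// The clock is taken directly from a flip-flop output, so it changes
// only on a CLK_IN edge and carries no combinatorial glitches.
always @ (posedge CLK_IN)
    CLK_DIV <= ~CLK_DIV;

// Secondary global buffer for the internally sourced clock.
BUFGS U_GBUF (.I(CLK_DIV), .O(CLK_DIV_GBUF));

// Logic clocked from the low-skew global net.
always @ (posedge CLK_DIV_GBUF)
    DOUT <= DATA;

endmodule
```

If the clock were instead decoded from several signals with gates, the
gate output could glitch when its inputs change at slightly different
times; registering the result is what removes that hazard.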

Note: Use Global Set/Reset resources when applicable. Refer to the
"Using Dedicated Global Set/Reset Resource" section in this chapter
for more information.

Inserting Clock Buffers

Many synthesis tools automatically insert a secondary global clock
buffer on all input ports that drive a register's clock pin or a gated
clock signal. Refer to your synthesis tool documentation for
information on disabling the automatic insertion of clock buffers, and
how to specify which ports have clock buffers.

Instantiating Global Clock Buffers 

You can instantiate global buffers in your code as described in this 
section. 

Instantiating Buffers Driven from a Port 

You can instantiate global buffers and connect them to high-fanout
ports in your code rather than inferring them from a synthesis tool
script. If you do instantiate global buffers, verify that the Pad
parameter is not specified for the buffer.
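
The following Verilog is a minimal sketch of a global buffer
instantiated on a clock port, assuming an XC4000E/L or Spartan target
and the primary global buffer (BUFGP) described earlier in this
chapter. The module, instance, and signal names (bufgp_port, U_CLKBUF,
CLK_PAD, CLK_GBUF) are hypothetical, and the I and O pin names are
assumed to match those of the BUFGS cell. How pad or input buffer
insertion is suppressed for CLK_PAD depends on your synthesis tool.

```verilog
// Hypothetical sketch: a primary global buffer driven from a port.
// Requires the Xilinx library model of BUFGP to simulate.
module bufgp_port (CLK_PAD, DATA, DOUT);

input  CLK_PAD, DATA;
output DOUT;

wire CLK_GBUF;   // clock after the global buffer
reg  DOUT;

// The external clock port drives the global buffer directly.
BUFGP U_CLKBUF (.I(CLK_PAD), .O(CLK_GBUF));

// All clocked logic uses the buffered, low-skew clock.
always @ (posedge CLK_GBUF)
    DOUT <= DATA;

endmodule
```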

Instantiating Buffers Driven from Internal Logic 

Some synthesis tools require you to instantiate a global buffer in your
code to use the dedicated routing resource if a high-fanout signal is
sourced from internal flip-flops or logic (such as a clock divider or
multiplexed clock), or if a clock is driven from the internal oscillator
or non-dedicated I/O pin. The following VHDL and Verilog examples
instantiate a BUFGS for an internal multiplexed clock circuit. A
Set Don't Touch or equivalent attribute is added to the instantiated
component to prevent further optimization by the synthesizer.

• VHDL Example


-----------------------------------------------
-- CLOCK_MUX.VHD Version 1.1                 --
-- This is an example of an instantiation of --
-- global buffer (BUFGS) from an internally  --
-- driven signal, a multiplexed clock.       --
-- March 1998                                --
-----------------------------------------------

library IEEE;
use IEEE.std_logic_1164.all;

entity clock_mux is
    port (DATA, SEL: in STD_LOGIC;
          SLOW_CLOCK, FAST_CLOCK: in STD_LOGIC;
          DOUT: out STD_LOGIC);
end clock_mux;

architecture XILINX of clock_mux is

    signal CLOCK: STD_LOGIC;
    signal CLOCK_GBUF: STD_LOGIC;

    component BUFGS
        port (I: in STD_LOGIC;
              O: out STD_LOGIC);
    end component;

begin

    Clock_MUX: process (SEL, FAST_CLOCK, SLOW_CLOCK)
    begin
        if (SEL = '1') then
            CLOCK <= FAST_CLOCK;
        else
            CLOCK <= SLOW_CLOCK;
        end if;
    end process;

    GBUF_FOR_MUX_CLOCK: BUFGS
        port map (I => CLOCK,
                  O => CLOCK_GBUF);

    Data_Path: process (CLOCK_GBUF)
    begin
        if (CLOCK_GBUF'event and CLOCK_GBUF='1') then
            DOUT <= DATA;
        end if;
    end process;

end XILINX;

• Verilog Example

/////////////////////////////////////////////////
// CLOCK_MUX.V Version 1.1                     //
// This is an example of an instantiation of   //
// global buffer (BUFGS) from an internally    //
// driven signal, a multiplexed clock.         //
// March 1998                                  //
/////////////////////////////////////////////////

module clock_mux (DATA, SEL, SLOW_CLOCK, FAST_CLOCK,
                  DOUT);

    input DATA, SEL;
    input SLOW_CLOCK, FAST_CLOCK;
    output DOUT;

    reg CLOCK;
    wire CLOCK_GBUF;
    reg DOUT;

    always @ (SEL or FAST_CLOCK or SLOW_CLOCK)
    begin
        if (SEL == 1'b1)
            CLOCK <= FAST_CLOCK;
        else
            CLOCK <= SLOW_CLOCK;
    end

    BUFGS GBUF_FOR_MUX_CLOCK (.O(CLOCK_GBUF),
                              .I(CLOCK));

    always @ (posedge CLOCK_GBUF)
        DOUT = DATA;

endmodule

Using Dedicated Global Set/Reset Resource

XC4000 and Spartan devices have a dedicated Global Set/Reset
(GSR) net that you can use to initialize all CLBs and IOBs. When the
GSR is asserted, every flip-flop in the FPGA is simultaneously preset
or cleared. You can access the GSR net from the GSR pin on the
STARTUP block or the GSRIN pin of the STARTBUF (VHDL).

Since the GSR net has dedicated routing resources that connect to the 
Preset or Clear pin of the flip-flops, you do not need to use general 
purpose routing or global buffer resources to connect to these pins. If 
your design has a Preset or Clear signal that affects every flip-flop in
your design, use the GSR net to increase design performance and 
reduce routing congestion. 

The XC5200 family has a dedicated Global Reset (GR) net that resets
all device registers. As in the XC4000 and Spartan devices, the
STARTUP or STARTBUF (VHDL) block must be instantiated in your
code in order to access this resource. The XC3000A devices also have
a dedicated Global Reset (GR) that is connected to a dedicated device
pin (see device pinout). Since this resource is always active, you do
not need to do anything to activate this feature.

For XC4000, Spartan, and XC5200 devices, the Global Set/Reset (GSR
or GR) signal is, by default, active high (it globally resets the device
when the signal equals logic 1). With older versions of synthesis
tools, an active-low reset requires an inverter instantiated in your
code to invert the global reset signal. The inverter is absorbed by the
STARTUP block and does not use any device resources (function
generators). Although the inversion can be described behaviorally in
your code, Xilinx recommends instantiating the inverter with these
older tools and placing a Don't Touch attribute on it before
compiling your design. Otherwise, the inverter may be mapped into a
CLB function generator, which adds delay to the reset signal and uses
device resources unnecessarily. Most new synthesis tools
automatically insert the STARTUP block, and the previous steps
are not required.

Note: For more information on simulating the Global Set/Reset, see
the "Simulating Your Design" chapter.

Startup State

Note: See the "Simulating Your Design" chapter for more information
on STARTUP and STARTBUF.

The GSR pin on the STARTUP block or the GSRIN pin on the
STARTBUF block drives the GSR net and connects to each flip-flop's
Preset and Clear pin. When you connect a signal from a pad to the
STARTUP block's GSR pin, the GSR net is activated. Because the GSR
net is built into the silicon, it does not appear in the pre-routed
netlist file. When the GSR signal is asserted High (the default), all
flip-flops and latches are set to the state they were in at the end of
configuration. When you simulate the routed design, the gate simulator
translation program correctly models the GSR function.

Note: For the XC3000 family and the XC5200 family, all flip-flops and
latches are reset to zero after configuration.

Preset vs. Clear (XC4000, Spartan) 

The XC4000 family flip-flops are configured as either preset
(asynchronous set) or clear (asynchronous reset). Automatic assertion of
the GSR net presets or clears each flip-flop. You can assert the GSR
pin at any time to produce this global effect. You can also preset or
clear individual flip-flops with the flip-flop's dedicated Preset or
Clear pin. When a Preset or Clear pin on a flip-flop is connected to an
active signal, the state of that signal controls the startup state of the
flip-flop. For example, if you connect an active signal to the Preset
pin, the flip-flop starts up in the preset state. If you do not connect the
Clear or Preset pin, the default startup state is a clear state. To change
the default to preset, assign an INIT=S attribute to the flip-flop.

I/O flip-flops and latches do not have individual Preset or Clear pins.
The default value of these flip-flops and latches is clear. To change the
default value to preset, assign an INIT=S attribute.

Refer to your synthesis tool documentation for information on 
changing the initial state of registers that do not use the Preset or 
Clear pins. 

Increasing Performance with the GSR/GR Net 

Many designs contain a net that initializes most of the flip-flops in the
design. If this signal can initialize all the flip-flops, you can use the
GSR/GR net. You should always include a net that initializes your
design to a known state.

To ensure that your HDL simulation results at the RTL level match
the synthesis results, write your code so that every flip-flop and latch
is preset or cleared when the GSR signal is asserted. The synthesis
tool cannot infer the GSR/GR net from HDL code. To utilize the GSR
net, you must instantiate the STARTUP or STARTBUF block (VHDL),
as shown in the "No GSR Implemented with Gates" figure.

Design Example without Dedicated GSR/GR 
Resource 

In the following VHDL and Verilog designs, the RESET signal
initializes all the registers in the design; however, it does not use the
dedicated global resources. The RESET signal is routed using regular
routing resources. These designs include two 4-bit counters. One
counter counts up and is reset to all zeros on assertion of RESET and
the other counter counts down and is reset to all ones on assertion of
RESET. The "No GSR Implemented with Gates" figure shows the
No GSR design implemented with gates.

• No GSR VHDL Example

-- NO_GSR Example
-- The signal RESET initializes all registers
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity no_gsr is
    port (CLOCK: in STD_LOGIC;
          RESET: in STD_LOGIC;
          UPCNT: out STD_LOGIC_VECTOR (3 downto 0);
          DNCNT: out STD_LOGIC_VECTOR (3 downto 0));
end no_gsr;

architecture SIMPLE of no_gsr is

    signal UP_CNT: STD_LOGIC_VECTOR (3 downto 0);
    signal DN_CNT: STD_LOGIC_VECTOR (3 downto 0);

begin

    UP_COUNTER: process (CLOCK, RESET)
    begin
        if (RESET = '1') then
            UP_CNT <= "0000";
        elsif (CLOCK'event and CLOCK = '1') then
            UP_CNT <= UP_CNT + 1;
        end if;
    end process;

    DN_COUNTER: process (CLOCK, RESET)
    begin
        if (RESET = '1') then
            DN_CNT <= "1111";
        elsif (CLOCK'event and CLOCK = '1') then
            DN_CNT <= DN_CNT - 1;
        end if;
    end process;

    UPCNT <= UP_CNT;
    DNCNT <= DN_CNT;

end SIMPLE;


• No GR VHDL Example

-- NO_GR.VHD Example
-- The signal RESET initializes all registers
-- without the use of the dedicated Global Reset
-- routing
-- December 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity no_gr is
    port (CLOCK: in STD_LOGIC;
          RESET: in STD_LOGIC;
          UPCNT: out STD_LOGIC_VECTOR (3 downto 0);
          DNCNT: out STD_LOGIC_VECTOR (3 downto 0));
end no_gr;

architecture XILINX of no_gr is

    signal UP_CNT: STD_LOGIC_VECTOR (3 downto 0);
    signal DN_CNT: STD_LOGIC_VECTOR (3 downto 0);

begin

    UP_COUNTER: process (CLOCK, RESET)
    begin
        if (RESET = '1') then
            UP_CNT <= "0000";
        elsif (CLOCK'event and CLOCK = '1') then
            UP_CNT <= UP_CNT + 1;
        end if;
    end process;

    DN_COUNTER: process (CLOCK, RESET)
    begin
        if (RESET = '1') then
            DN_CNT <= "1111";
        elsif (CLOCK'event and CLOCK = '1') then
            DN_CNT <= DN_CNT - 1;
        end if;
    end process;

    UPCNT <= UP_CNT;
    DNCNT <= DN_CNT;

end XILINX;

• No GSR Verilog Example

/* NO_GSR Example
 * The signal RESET initializes all registers
 * December 1997 */

module no_gsr ( CLOCK, RESET, UPCNT, DNCNT);

    input CLOCK, RESET;
    output [3:0] UPCNT;
    output [3:0] DNCNT;

    reg [3:0] UPCNT;
    reg [3:0] DNCNT;

    always @ (posedge CLOCK or posedge RESET) begin
        if (RESET) begin
            UPCNT = 4'b0000;
            DNCNT = 4'b1111;
        end else begin
            UPCNT = UPCNT + 1'b1;
            DNCNT = DNCNT - 1'b1;
        end
    end

endmodule
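
The reset behavior described for this design can be checked with a
small test bench before synthesis. The following is only a sketch for
functional (RTL) simulation: the test bench name (no_gsr_tb), the
instance name (UUT), the ERROR flag, and all timing values are
assumptions, and it expects the no_gsr module listed above to be
compiled with it.

```verilog
// Hypothetical RTL test bench for the no_gsr module listed above.
// Timing values are assumed; they are not taken from the design.
`timescale 1ns / 1ns

module no_gsr_tb;

reg        CLOCK, RESET;
wire [3:0] UPCNT, DNCNT;
reg        ERROR;

no_gsr UUT (.CLOCK(CLOCK), .RESET(RESET),
            .UPCNT(UPCNT), .DNCNT(DNCNT));

// Free-running clock with rising edges at 10, 30, 50, and 70 ns.
initial CLOCK = 1'b0;
always #10 CLOCK = ~CLOCK;

initial begin
    ERROR = 1'b0;
    RESET = 1'b1;               // assert the active-high reset
    #25;
    // While RESET is high the up-counter must hold all zeros and
    // the down-counter all ones.
    if (UPCNT !== 4'b0000 || DNCNT !== 4'b1111) begin
        $display("Reset check failed: UPCNT=%b DNCNT=%b", UPCNT, DNCNT);
        ERROR = 1'b1;
    end

    RESET = 1'b0;               // release reset at 25 ns
    #50;                        // rising edges at 30, 50, and 70 ns
    // Three clock edges after reset: 0 + 3 and 15 - 3.
    if (UPCNT !== 4'b0011 || DNCNT !== 4'b1100) begin
        $display("Count check failed: UPCNT=%b DNCNT=%b", UPCNT, DNCNT);
        ERROR = 1'b1;
    end

    if (!ERROR)
        $display("All checks passed");
    $finish;
end

endmodule
```

The same stimulus could be reused for a timing simulation, although
the sample points would then need to allow for the routed delays.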


• No GR Verilog Example

/* NO_GR.V Example
 * The signal RESET initializes all registers
 * Aug 1997 */

module no_gr ( CLOCK, RESET, UPCNT, DNCNT);

    input CLOCK, RESET;
    output [3:0] UPCNT;
    output [3:0] DNCNT;

    reg [3:0] UPCNT;
    reg [3:0] DNCNT;

    always @ (posedge CLOCK or posedge RESET) begin
        if (RESET) begin
            UPCNT = 4'b0000;
            DNCNT = 4'b1111;
        end else begin
            UPCNT = UPCNT + 1'b1;
            DNCNT = DNCNT - 1'b1;
        end
    end

endmodule

Design Example with Dedicated GSR/GR Resource

To reduce routing congestion and improve the overall performance of
the reset net in the No GSR and No GR designs, use the dedicated
GSR or GR net instead of the general purpose routing. Instantiate the
STARTUP, STARTBUF, ROC, or TOC block in your design and use the
GSR or GR pin on the STARTUP block (or the GSRIN pin on the
STARTBUF block) to access the global reset net. This is not necessary
with many synthesis tools. If you fully define the behavior of the GSR
or GR net, the tool infers a STARTUP block. The modified designs
(Use GSR and Use GR) are included at the end of this section. The
Use GSR design implemented with gates is shown in the
"Active Low GSR Implemented with Gates" figure.

In XC4000 and Spartan designs, on assertion of the GSR net, flip-flops
return to a clear (or Low) state by default. You can override this
default by describing an asynchronous preset in your code, or by
adding the INIT="1" or equivalent attribute to the flip-flop
(described later in this section).

In XC5200 family designs, the GR resets all flip-flops in the device to
a logic zero. If a flip-flop is described as asynchronous preset to a
logic 1, the synthesis tool automatically infers a flip-flop with a
synchronous preset, and the Xilinx software puts an inverter on the
input and output of the device to simulate a preset.

The Use GSR and Use GR designs explicitly state that the
down-counter resets to all ones; therefore, asserting the reset net
causes this counter to reset to a default of all zeros. You can use one
of the following two methods to prevent this reset to zeros.

• Remove the comment characters from the last few lines of code in
the Use GSR or Use GR design. These lines of code correctly
describe the behavior of the design (in response to the assertion
of reset). However, when you synthesize the design, the Preset
pins on the flip-flops that form the down-counter are used and
the Clear pins on the flip-flops that form the up-counter are used.
Using these pins defeats the purpose of using the GSR or GR net.

• Attach the INIT="1" or equivalent attribute to the down-counter
flip-flops.

The synthesizer may do this if necessary, depending on your
code's initialization state when the reset is applied. Refer to your
synthesis tool documentation for more information on assigning
attributes. This attribute allows you to override the default clear
(or Low) state when your code does not specify a preset
condition. However, because attributes are assigned outside the HDL
code, the code no longer accurately represents the behavior of the
design.

Xilinx recommends removing the comment characters from the last
few lines of the Use GSR or Use GR code when you perform an RTL
simulation and attaching the INIT=S attribute to the relevant
flip-flops when you synthesize the design.

The STARTUP or STARTBUF block must not be optimized during the
synthesis process. Add the appropriate attribute to prevent
optimization before compiling your design.

• Use GSR VHDL Example (XC4000 family)

-- USE_GSR.VHD Example
-- The signal RESET is connected to the
-- GSRIN pin of the STARTBUF block
-- May 1997

library IEEE;
library UNISIM;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use UNISIM.all;

entity use_gsr is
    port (CLOCK: in STD_LOGIC;
          RESET: in STD_LOGIC;
          UPCNT: out STD_LOGIC_VECTOR (3 downto 0);
          DNCNT: out STD_LOGIC_VECTOR (3 downto 0));
end use_gsr;

architecture XILINX of use_gsr is

    component STARTBUF
        port (GSRIN: in STD_LOGIC;
              GSROUT: out STD_LOGIC);
    end component;

    signal RESET_INT: STD_LOGIC;
    signal UP_CNT: STD_LOGIC_VECTOR (3 downto 0);
    signal DN_CNT: STD_LOGIC_VECTOR (3 downto 0);

begin

    U1: STARTBUF port map (GSRIN => RESET,
                           GSROUT => RESET_INT);

    UP_COUNTER: process (CLOCK, RESET_INT)
    begin
        if (RESET_INT = '1') then
            UP_CNT <= "0000";
        elsif (CLOCK'event and CLOCK = '1') then
            UP_CNT <= UP_CNT + 1;
        end if;
    end process;

    DN_COUNTER: process (CLOCK, RESET_INT)
    begin
        if (RESET_INT = '1') then
            DN_CNT <= "1111";
        elsif (CLOCK'event and CLOCK = '1') then
            DN_CNT <= DN_CNT - 1;
        end if;
    end process;

    UPCNT <= UP_CNT;
    DNCNT <= DN_CNT;

end XILINX;

• Use GR VHDL Example

------------------------------------------------
-- USE_GR.VHD Version 1.0                     --
-- Xilinx HDL Synthesis Design Guide          --
-- The signal RESET initializes all registers --
-- Using the global reset resources since     --
-- STARTBUF block was added                   --
-- December 1997                              --
------------------------------------------------

library IEEE;
library UNISIM;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use UNISIM.all;

entity use_gr is
    port (CLOCK: in STD_LOGIC;
          RESET: in STD_LOGIC;
          UPCNT: out STD_LOGIC_VECTOR (3 downto 0);
          DNCNT: out STD_LOGIC_VECTOR (3 downto 0));
end use_gr;

architecture XILINX of use_gr is

    component STARTBUF
        port (GSRIN: in STD_LOGIC;
              GSROUT: out STD_LOGIC);
    end component;

    signal RESET_INT: STD_LOGIC;
    signal UP_CNT: STD_LOGIC_VECTOR (3 downto 0);
    signal DN_CNT: STD_LOGIC_VECTOR (3 downto 0);

begin

    U1: STARTBUF port map (GSRIN => RESET,
                           GSROUT => RESET_INT);

    UP_COUNTER: process (CLOCK, RESET_INT)
    begin
        if (RESET_INT = '1') then
            UP_CNT <= "0000";
        elsif (CLOCK'event and CLOCK = '1') then
            UP_CNT <= UP_CNT + 1;
        end if;
    end process;

    DN_COUNTER: process (CLOCK, RESET_INT)
    begin
        if (RESET_INT = '1') then
            DN_CNT <= "1111";
        elsif (CLOCK'event and CLOCK = '1') then
            DN_CNT <= DN_CNT - 1;
        end if;
    end process;

    UPCNT <= UP_CNT;
    DNCNT <= DN_CNT;

end XILINX;

• Use GSR Verilog Example

//////////////////////////////////////////////////
// USE_GSR.V Version 1.0                        //
// The signal RESET initializes all registers   //
// Using the global reset resources (STARTUP)   //
// December 1997                                //
//////////////////////////////////////////////////

module use_gsr ( CLOCK, RESET, UPCNT, DNCNT);

    input CLOCK, RESET;
    output [3:0] UPCNT;
    output [3:0] DNCNT;

    reg [3:0] UPCNT;
    reg [3:0] DNCNT;

    STARTUP U1 (.GSR(RESET));

    always @ (posedge CLOCK or posedge RESET) begin
        if (RESET) begin
            UPCNT = 4'b0000;
            DNCNT = 4'b1111;
        end else begin
            UPCNT = UPCNT + 1'b1;
            DNCNT = DNCNT - 1'b1;
        end
    end

endmodule

• Use GR Verilog Example

//////////////////////////////////////////////////
// USE_GR.V Version 1.0                         //
// The signal RESET initializes all registers   //
// Using the global reset resources since       //
// STARTUP block instantiation was added        //
// December 1997                                //
//////////////////////////////////////////////////

module use_gr ( CLOCK, RESET, UPCNT, DNCNT);

    input CLOCK, RESET;
    output [3:0] UPCNT;
    output [3:0] DNCNT;

    reg [3:0] UPCNT;
    reg [3:0] DNCNT;

    STARTUP U1 (.GR(RESET));

    always @ (posedge CLOCK or posedge RESET) begin
        if (RESET) begin
            UPCNT = 4'b0000;
            DNCNT = 4'b1111;
        end else begin
            UPCNT = UPCNT + 1'b1;
            DNCNT = DNCNT - 1'b1;
        end
    end

endmodule

Design Example with Active Low GSR/GR Signal

The Active Low GSR design is identical to the Use GSR design
except an INV is instantiated and connected between the RESET port
and the STARTUP block. Also, a Set Don't Touch (or equivalent)
attribute is added to the synthesis tool script for both the INV and
STARTUP, or STARTBUF (VHDL) symbols. By instantiating the
inverter, the global set/reset signal is now active low (logic level 0
resets all FPGA flip-flops). The inverter is absorbed into the
STARTUP block in the device and no CLB resources are used to invert
the signal. This is not necessary with many synthesis tools. If all
registers and latches are described in the RTL code as reset or set,
then a GSR or GR is inferred. Some tools also give you the option to
select any signal as the GSR or GR net. This allows you to correct
problems if the RTL code does not completely describe the GSR/GR
behavior. However, the RTL code will not match the place and route
behavior because not all registers are described as set or reset with
the GSR/GR signal. Some tools provide a report of the inferred
registers that are missing the GSR/GR behavior, and allow you to change
the RTL behavior. VHDL and Verilog Active Low GSR designs are shown
following.

• Active Low GSR VHDL Example

--------------------------------------------------
-- ACTIVE_LOW_GSR.VHD Version 1.0               --
-- The signal RESET is inverted before being    --
-- connected to the GSRIN pin of the STARTBUF   --
-- The inverter will be absorbed by the STARTBUF--
-- September 1997                               --
--------------------------------------------------

library IEEE;
library UNISIM;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use UNISIM.all;

entity active_low_gsr is
    port (CLOCK: in STD_LOGIC;
          RESET: in STD_LOGIC;
          UPCNT: out STD_LOGIC_VECTOR (3 downto 0);
          DNCNT: out STD_LOGIC_VECTOR (3 downto 0));
end active_low_gsr;

architecture XILINX of active_low_gsr is

    component INV
        port (I: in STD_LOGIC;
              O: out STD_LOGIC);
    end component;

    component STARTBUF
        port (GSRIN: in STD_LOGIC;
              GSROUT: out STD_LOGIC);
    end component;

    signal RESET_NOT: STD_LOGIC;
    signal RESET_NOT_INT: STD_LOGIC;
    signal UP_CNT: STD_LOGIC_VECTOR (3 downto 0);
    signal DN_CNT: STD_LOGIC_VECTOR (3 downto 0);

begin

    U1: INV port map (I => RESET, O => RESET_NOT);

    U2: STARTBUF port map (GSRIN => RESET_NOT,
                           GSROUT => RESET_NOT_INT);

    UP_COUNTER: process (CLOCK, RESET_NOT_INT)
    begin
        if (RESET_NOT_INT = '1') then
            UP_CNT <= "0000";
        elsif (CLOCK'event and CLOCK = '1') then
            UP_CNT <= UP_CNT + 1;
        end if;
    end process;

    DN_COUNTER: process (CLOCK, RESET_NOT_INT)
    begin
        if (RESET_NOT_INT = '1') then
            DN_CNT <= "1111";
        elsif (CLOCK'event and CLOCK = '1') then
            DN_CNT <= DN_CNT - 1;
        end if;
    end process;

    UPCNT <= UP_CNT;
    DNCNT <= DN_CNT;

end XILINX;

• Active Low GSR Verilog Example

/////////////////////////////////////////////////////
// ACTIVE_LOW_GSR.V Version 1.0                    //
// The signal RESET is inverted before being       //
// connected to the GSR pin of the STARTUP block   //
// The inverter will be absorbed by STARTUP in M1  //
// September 1997                                  //
/////////////////////////////////////////////////////

module active_low_gsr (CLOCK, RESET, UPCNT, DNCNT);

input CLOCK, RESET;
output [3:0] UPCNT;
output [3:0] DNCNT;

wire RESET_NOT;

reg [3:0] UPCNT;
reg [3:0] DNCNT;

INV U1 (.O(RESET_NOT), .I(RESET));

STARTUP U2 (.GSR(RESET_NOT));

always @ (posedge CLOCK or posedge RESET_NOT)
begin
  if (RESET_NOT)
  begin
    UPCNT = 4'b0000;
    DNCNT = 4'b1111;
  end
  else
  begin
    UPCNT = UPCNT + 1'b1;
    DNCNT = DNCNT - 1'b1;
  end
end

endmodule
Encoding State Machines

The traditional methods used to generate state machine logic result in
highly-encoded states. State machines with highly-encoded state
variables typically have a minimum number of flip-flops and wide
combinatorial functions. These characteristics are acceptable for PAL
and gate array architectures. However, because FPGAs have many
flip-flops and narrow function generators, highly-encoded state
variables can result in inefficient implementation in terms of speed and
density.

One-hot encoding allows you to create state machine implementations
that are more efficient for FPGA architectures. You can create
state machines with one flip-flop per state and decreased width of
combinatorial logic. One-hot encoding is usually the preferred
method for large FPGA-based state machine implementation. For
small state machines (fewer than 8 states), binary encoding may be
more efficient. To improve design performance, you can divide large
(greater than 32 states) state machines into several small state
machines and use the appropriate encoding style for each.
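
The "decreased width of combinatorial logic" can be seen in a small sketch. The module below is illustrative only and is not one of the design examples in this manual; the module name, the GO, DONE, and BUSY signals, and the three states are assumptions. Each next-state equation tests one or two state bits plus an input, so each equation stays narrow.

```verilog
// Hypothetical three-state sequencer with one flip-flop per state.
// All names here are assumptions made for illustration.
module onehot_sketch (CLOCK, RESET, GO, DONE, BUSY);
  input  CLOCK, RESET, GO, DONE;
  output BUSY;

  // Bit positions of the state flip-flops
  parameter IDLE = 0, RUN = 1, HOLD = 2;

  reg [2:0] CS;   // current state, exactly one bit is High
  reg [2:0] NS;   // next state

  always @ (posedge CLOCK or posedge RESET)
  begin
    if (RESET)
      CS <= 3'b001;   // preset the IDLE flip-flop, clear the others
    else
      CS <= NS;
  end

  // Each next-state bit is a function of at most two state bits
  // and one input, which suits narrow function generators.
  always @ (CS or GO or DONE)
  begin
    NS       = 3'b000;
    NS[IDLE] = (CS[IDLE] & ~GO) | (CS[HOLD] & DONE);
    NS[RUN]  = (CS[IDLE] & GO);
    NS[HOLD] = CS[RUN] | (CS[HOLD] & ~DONE);
  end

  // Decoding a one-hot state is a single-bit test
  assign BUSY = ~CS[IDLE];
endmodule
```

With a binary encoding, the same BUSY output and each next-state bit would be decoded from every bit of the state vector, which is the wide logic the text refers to.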

Three design examples are provided in this section to illustrate the 
three coding methods (binary, enumerated type, and one-hot) you 
can use to create state machines. All three examples contain the same 
Case statement. To conserve space, the complete Case statement is 
only included in the binary encoded state machine example; refer to 
this example when reviewing the enumerated type and one-hot 
examples. 

Some synthesis tools allow you to add an attribute, such as
type_encoding_style, to your VHDL code to set the encoding style.
This is a synthesis vendor attribute (not a Xilinx attribute). Refer to
your synthesis tool documentation for information on
attribute-driven state machine synthesis.

Note: The bold text in each of the three examples indicates the 
portion of the code that varies depending on the method used to 
encode the state machine. 

Using Binary Encoding 

The state machine bubble diagram in the following figure shows the
operation of a seven-state machine that reacts to inputs A through E
as well as previous-state conditions. The binary encoded method of
coding this state machine is shown in the VHDL and Verilog examples
that follow. These design examples show you how to take a
design that has been previously encoded (for example, binary
encoded) and synthesize it to the appropriate decoding logic and
registers. These designs use three flip-flops to implement seven
states.


Figure: State machine bubble diagram



Binary Encoded State Machine VHDL Example 


-- BINARY.VHD Version 1.0
-- Example of a binary encoded state machine
-- May 1997

Library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity binary is
  port (CLOCK, RESET : in STD_LOGIC;
        A, B, C, D, E: in BOOLEAN;
        SINGLE, MULTI, CONTIG: out STD_LOGIC);
end binary;

architecture BEHV of binary is

  type STATE_TYPE is (S1, S2, S3, S4, S5, S6, S7);
  attribute ENUM_ENCODING: STRING;
  attribute ENUM_ENCODING of STATE_TYPE: type is "001 010 011 100 101 110 111";

  signal CS, NS: STATE_TYPE;

begin

  SYNC_PROC: process (CLOCK, RESET)
  begin
    if (RESET='1') then
      CS <= S1;
    elsif (CLOCK'event and CLOCK = '1') then
      CS <= NS;
    end if;
  end process; --End SYNC_PROC

  COMB_PROC: process (CS, A, B, C, D, E)
  begin
    case CS is
      when S1 =>
        MULTI <= '0';
        CONTIG <= '0';
        SINGLE <= '0';
        if (A and not B and C) then
          NS <= S2;
        elsif (A and B and not C) then
          NS <= S4;
        else
          NS <= S1;
        end if;
      when S2 =>
        MULTI <= '1';
        CONTIG <= '0';
        SINGLE <= '0';
        if (not D) then
          NS <= S3;
        else
          NS <= S4;
        end if;
      when S3 =>
        MULTI <= '0';
        CONTIG <= '1';
        SINGLE <= '0';
        if (A or D) then
          NS <= S4;
        else
          NS <= S3;
        end if;
      when S4 =>
        MULTI <= '1';
        CONTIG <= '1';
        SINGLE <= '0';
        if (A and B and not C) then
          NS <= S5;
        else
          NS <= S4;
        end if;
      when S5 =>
        MULTI <= '1';
        CONTIG <= '0';
        SINGLE <= '0';
        NS <= S6;
      when S6 =>
        MULTI <= '0';
        CONTIG <= '1';
        SINGLE <= '1';
        if (not E) then
          NS <= S7;
        else
          NS <= S6;
        end if;
      when S7 =>
        MULTI <= '0';
        CONTIG <= '1';
        SINGLE <= '0';
        if (E) then
          NS <= S1;
        else
          NS <= S7;
        end if;
    end case;
  end process; -- End COMB_PROC

end BEHV;
Binary Encoded State Machine Verilog Example

/////////////////////////////////////////////////
// BINARY.V Version 1.0                        //
// Example of a binary encoded state machine   //
// May 1997                                    //
/////////////////////////////////////////////////

module binary (CLOCK, RESET, A, B, C, D, E,
               SINGLE, MULTI, CONTIG);

input CLOCK, RESET;
input A, B, C, D, E;
output SINGLE, MULTI, CONTIG;

reg SINGLE, MULTI, CONTIG;

// Declare the symbolic names for states
parameter [2:0]
  S1 = 3'b001,
  S2 = 3'b010,
  S3 = 3'b011,
  S4 = 3'b100,
  S5 = 3'b101,
  S6 = 3'b110,
  S7 = 3'b111;

// Declare current state and next state variables
reg [2:0] CS;
reg [2:0] NS;

// state_vector CS

always @ (posedge CLOCK or posedge RESET)
begin
  if (RESET == 1'b1)
    CS = S1;
  else
    CS = NS;
end

always @ (CS or A or B or C or D or E)
begin
  case (CS)
    S1 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b0;
      SINGLE = 1'b0;
      if (A && ~B && C)
        NS = S2;
      else if (A && B && ~C)
        NS = S4;
      else
        NS = S1;
    end
    S2 :
    begin
      MULTI = 1'b1;
      CONTIG = 1'b0;
      SINGLE = 1'b0;
      if (!D)
        NS = S3;
      else
        NS = S4;
    end
    S3 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b1;
      SINGLE = 1'b0;
      if (A || D)
        NS = S4;
      else
        NS = S3;
    end
    S4 :
    begin
      MULTI = 1'b1;
      CONTIG = 1'b1;
      SINGLE = 1'b0;
      if (A && B && ~C)
        NS = S5;
      else
        NS = S4;
    end
    S5 :
    begin
      MULTI = 1'b1;
      CONTIG = 1'b0;
      SINGLE = 1'b0;
      NS = S6;
    end
    S6 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b1;
      SINGLE = 1'b1;
      if (!E)
        NS = S7;
      else
        NS = S6;
    end
    S7 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b1;
      SINGLE = 1'b0;
      if (E)
        NS = S1;
      else
        NS = S7;
    end
  endcase
end

endmodule


Using Enumerated Type Encoding


Using Enumerated Type Encoding 


The recommended encoding style for state machines depends on
which synthesis tool you are using. Some synthesis tools encode
better than others depending on the device architecture and the size
of the decode logic. You can explicitly declare state vectors or you can
allow your synthesis tool to determine the vectors. Xilinx recommends
that you use enumerated type encoding to specify the states
and use the Finite State Machine (FSM) extraction commands to
extract and encode the state machine as well as to perform state
minimization and optimization algorithms. The enumerated type method
of encoding the seven-state machine is shown in the following VHDL
and Verilog examples. The encoding style is not defined in the code,
but can be specified later with the FSM extraction commands.
Alternatively, you can allow your compiler to select the encoding style
that results in the lowest gate count when the design is synthesized.
Some synthesis tools automatically find finite state machines and
compile without the need for specification.

Note: Refer to the previous VHDL and Verilog Binary Encoded State 
Machine examples for the complete Case statement portion of the 
code. 

Enumerated Type Encoded State Machine VHDL 
Example 

Library IEEE;
use IEEE.std_logic_1164.all;

entity enum is
  port (CLOCK, RESET : in STD_LOGIC;
        A, B, C, D, E: in BOOLEAN;
        SINGLE, MULTI, CONTIG: out STD_LOGIC);
end enum;

architecture BEHV of enum is

  type STATE_TYPE is (S1, S2, S3, S4, S5, S6, S7);

  signal CS, NS: STATE_TYPE;

begin

  SYNC_PROC: process (CLOCK, RESET)
  begin
    if (RESET='1') then
      CS <= S1;
    elsif (CLOCK'event and CLOCK = '1') then
      CS <= NS;
    end if;
  end process; --End SYNC_PROC

  COMB_PROC: process (CS, A, B, C, D, E)
  begin
    case CS is
      when S1 =>
        MULTI <= '0';
        CONTIG <= '0';
        SINGLE <= '0';
Enumerated Type Encoded State Machine Verilog
Example

///////////////////////////////////////////////////////
// ENUM.V Version 1.0                                //
// Example of an enumerated encoded state machine    //
// May 1997                                          //
///////////////////////////////////////////////////////

module enum (CLOCK, RESET, A, B, C, D, E,
             SINGLE, MULTI, CONTIG);

input CLOCK, RESET;
input A, B, C, D, E;
output SINGLE, MULTI, CONTIG;

reg SINGLE, MULTI, CONTIG;

// Declare the symbolic names for states
parameter [2:0]
  S1 = 3'b000,
  S2 = 3'b001,
  S3 = 3'b010,
  S4 = 3'b011,
  S5 = 3'b100,
  S6 = 3'b101,
  S7 = 3'b110;

// Declare current state and next state variables
reg [2:0] CS;
reg [2:0] NS;

// state_vector CS

always @ (posedge CLOCK or posedge RESET)
begin
  if (RESET == 1'b1)
    CS = S1;
  else
    CS = NS;
end

always @ (CS or A or B or C or D or E)
begin
  case (CS)
    S1 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b0;
      SINGLE = 1'b0;
      if (A && ~B && C)
        NS = S2;
      else if (A && B && ~C)
        NS = S4;
      else
        NS = S1;
    end


Using One-Hot Encoding 

One-hot encoding allows you to create state machine implementations
that are more efficient for FPGA architectures. One-hot
encoding is usually the preferred method for large FPGA-based state
machine implementation.

The following examples show a one-hot encoded state machine. Use
this method to control the state vector specification or when you
want to specify the names of the state registers. These examples use
one flip-flop for each of the seven states. If you are using FPGA
Express, use enumerated type, and avoid using the "when others"
construct in the VHDL case statement. This construct can result in a
very large state machine.

Note: Refer to the previous VHDL and Verilog Binary Encoded State
Machine examples for the complete Case statement portion of the
code. See the "Accelerate FPGA Macros with One-Hot Approach"
appendix for a detailed description of one-hot encoding and its
applications.
One-hot Encoded State Machine VHDL Example

Library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity one_hot is
  port (CLOCK, RESET : in STD_LOGIC;
        A, B, C, D, E: in BOOLEAN;
        SINGLE, MULTI, CONTIG: out STD_LOGIC);
end one_hot;

architecture BEHV of one_hot is

  type STATE_TYPE is (S1, S2, S3, S4, S5, S6, S7);
  attribute ENUM_ENCODING: STRING;
  attribute ENUM_ENCODING of STATE_TYPE: type is "0000001 0000010 0000100 0001000 0010000 0100000 1000000";

  signal CS, NS: STATE_TYPE;

begin

  SYNC_PROC: process (CLOCK, RESET)
  begin
    if (RESET='1') then
      CS <= S1;
    elsif (CLOCK'event and CLOCK = '1') then
      CS <= NS;
    end if;
  end process; --End SYNC_PROC

  COMB_PROC: process (CS, A, B, C, D, E)
  begin
    case CS is
      when S1 =>
        MULTI <= '0';
        CONTIG <= '0';
        SINGLE <= '0';
        if (A and not B and C) then
          NS <= S2;
        elsif (A and B and not C) then
          NS <= S4;
        else
          NS <= S1;
        end if;


One-hot Encoded State Machine Verilog Example

//////////////////////////////////////////////////
// ONE_HOT.V Version 1.0                        //
// Example of a one-hot encoded state machine   //
// Xilinx HDL Synthesis Design Guide for FPGAs  //
// May 1997                                     //
//////////////////////////////////////////////////

module one_hot (CLOCK, RESET, A, B, C, D, E,
                SINGLE, MULTI, CONTIG);

input CLOCK, RESET;
input A, B, C, D, E;
output SINGLE, MULTI, CONTIG;

reg SINGLE, MULTI, CONTIG;

// Declare the symbolic names for states
parameter [6:0]
  S1 = 7'b0000001,
  S2 = 7'b0000010,
  S3 = 7'b0000100,
  S4 = 7'b0001000,
  S5 = 7'b0010000,
  S6 = 7'b0100000,
  S7 = 7'b1000000;

// Declare current state and next state variables
// (seven bits wide, to match the seven-bit state parameters)
reg [6:0] CS;
reg [6:0] NS;

// state_vector CS

always @ (posedge CLOCK or posedge RESET)
begin
  if (RESET == 1'b1)
    CS = S1;
  else
    CS = NS;
end

always @ (CS or A or B or C or D or E)
begin
  case (CS)
    S1 :
    begin
      MULTI = 1'b0;
      CONTIG = 1'b0;
      SINGLE = 1'b0;
      if (A && ~B && C)
        NS = S2;
      else if (A && B && ~C)
        NS = S4;
      else
        NS = S1;


Summary of Encoding Styles 

In the three previous examples, the state machine's possible states are
defined by an enumeration type. Use the following syntax to define
an enumeration type.

type type_name is (enumeration_literal {, enumeration_literal});

After you have defined an enumeration type, declare the signal
representing the states as the enumeration type as follows.

type STATE_TYPE is (S1, S2, S3, S4, S5, S6, S7);

signal CS, NS: STATE_TYPE;

The state machine described in the three previous examples has
seven states. The possible values of the signals CS (Current State)
and NS (Next State) are S1, S2, ..., S6, S7.

To select an encoding style for a state machine, specify the state
vectors. Alternatively, you can specify the encoding style when the
state machine is compiled. Xilinx recommends that you specify an
encoding style. If you do not specify a style, your compiler selects a
style that minimizes the gate count. For the state machine shown in
the three previous examples, the compiler selected the binary
encoded style: S1="000", S2="001", S3="010", S4="011", S5="100",
S6="101", and S7="110".

You can use the FSM extraction tool to change the encoding style of a 
state machine. For example, use this tool to convert a binary-encoded 
state machine to a one-hot encoded state machine. 

Note: Refer to your synthesis tool documentation for instructions on 
how to extract the state machine and change the encoding style. 

Comparing Synthesis Results for Encoding Styles 

The following table summarizes the synthesis results from the
different methods used to encode the state machine in the three
previous VHDL and Verilog state machine examples. The results are
for an XC4005EPC84-2 device.

Note: The Timing Analyzer was used to obtain the timing results in
this table.


Table 4-1 State Machine Encoding Styles Comparison
(XC4005E-2)

Comparison        One-Hot           Binary            Enum (One-hot)

Occupied CLBs     6                 9                 6
CLB Flip-flops    6                 3                 7
PadToSetup        9.4 ns (3) (a)    13.4 ns (4)       9.6 ns (3)
ClockToPad        15.1 ns (3)       15.1 ns (3)       14.9 ns (3)
ClockToSetup      13.0 ns (4)       13.9 ns (4)       10.1 ns (3)

a. The number in parentheses represents the CLB block level delay.


The binary-encoded state machine has the longest ClockToSetup
delay. Generally, the FSM extraction tool provides the best results
because the compiler reduces any redundant states and optimizes the
state machine after the extraction.
Initializing the State Machine

When creating a state machine, especially when you use one-hot
encoding, add the following lines of code to your design to ensure
that the FPGA is initialized to a Set state.

• VHDL Example

SYNC_PROC: process (CLOCK, RESET)
begin
  if (RESET='1') then
    CS <= S1;

• Verilog Example

always @ (posedge CLOCK or posedge RESET)
begin
  if (RESET == 1'b1)
    CS = S1;

Alternatively, you can assign an INIT=S attribute to the initial state
register to specify the initial state. Refer to your synthesis tool
documentation for information on assigning this attribute.

In the Binary Encoded State Machine example, the RESET signal forces
the S1 flip-flop to be preset (initialized to 1) while the other flip-flops
are cleared (initialized to 0).
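
The initialization fragments in this section can be completed into a small state register. This is a sketch only: the module name and the NS input port are assumptions, and the seven-bit S1 value is borrowed from the one-hot example so that the preset and cleared flip-flops are easy to see.

```verilog
// Hypothetical stand-alone state register showing the reset value.
// NS would normally come from the combinatorial next-state process.
module fsm_init_sketch (CLOCK, RESET, NS, CS);
  input        CLOCK, RESET;
  input  [6:0] NS;
  output [6:0] CS;
  reg    [6:0] CS;

  // Initial state: one flip-flop High, six flip-flops Low
  parameter [6:0] S1 = 7'b0000001;

  always @ (posedge CLOCK or posedge RESET)
  begin
    if (RESET == 1'b1)
      CS <= S1;   // bit 0 is preset, bits 6:1 are cleared
    else
      CS <= NS;
  end
endmodule
```

Because the reset value has a 1 in exactly one position, a synthesis tool would be expected to map that bit to a flip-flop with an asynchronous preset and the remaining bits to flip-flops with an asynchronous clear.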

Using Dedicated I/O Decoders 

The periphery of XC4000 family devices has four wide decoder 
circuits at each edge. The inputs to each decoder are any of the IOB 
signals on that edge plus one local interconnect per CLB row or 
column. Each decoder generates a High output (using a pull-up 
resistor) when the AND condition of the selected inputs or their 
complements is true. The decoder outputs drive CLB inputs so they 
can be combined with other logic or can be routed directly to the chip 
outputs. 
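
The wired-AND behavior described above can be modeled for simulation. The sketch below is a behavioral illustration, not an instantiation of the device primitives, and the module and port names are assumptions. It uses the standard Verilog tri1 net type to stand in for the pull-up resistor and lets each selected input pull the shared line Low.

```verilog
// Behavioral model of one open-drain edge decoder line (simulation only).
// The decoded pattern ADR = 4'b1110 is an arbitrary choice for illustration.
module edge_decode_sketch (ADR, MATCH);
  input  [3:0] ADR;
  output       MATCH;

  // tri1 reads High whenever no driver pulls the net Low
  tri1 MATCH;

  // Each selected input (or its complement) can only pull the line Low
  assign MATCH = ADR[3]  ? 1'bz : 1'b0;
  assign MATCH = ADR[2]  ? 1'bz : 1'b0;
  assign MATCH = ADR[1]  ? 1'bz : 1'b0;
  assign MATCH = ~ADR[0] ? 1'bz : 1'b0;
endmodule
```

MATCH is High only when every driver releases the line, that is, when ADR[3:1] are all High and ADR[0] is Low. Any other input value leaves at least one driver pulling the line Low.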

To implement XC4000 family edge decoders in HDL, you must
instantiate edge decoder primitives. The primitive names you can use
vary with the synthesis tool you are using. For example, you can
instantiate DECODE1_IO, DECODE1_INT, DECODE4, DECODE8,
and DECODE16. These primitives are implemented using the
dedicated I/O edge decoders. The XC4000 family wide decoder outputs
are effectively open-drain and require a pull-up resistor to take the
output High when the specified pattern is detected on the decoder
inputs. To attach the pull-up resistor to the output signal, you must
instantiate a PULLUP component.

The following VHDL example shows how to use the I/O edge
decoders by instantiating decode primitives. Each decoder output is a
function of ADR (IOB inputs) and CLB_INT (local interconnects). The
AND function of each DECODE output and Chip Select (CS) serves
as the source of a flip-flop Clock Enable pin. The four edge decoders
in this design are placed on the same device edge. The "Schematic
Block Representation of I/O Decoder" figure shows the schematic
block diagram representation of this I/O decoder design.

Using Dedicated I/O Decoders VHDL Example

--Edge Decoder
--An XC4000 LCA has special decoder circuits at each edge. These decoders
--are open-drained wired-AND gates. When one or more of the inputs (I) are
--Low, output (O) is Low. When all of the inputs are High, the output is
--High. A pull-up resistor must be connected to the output node to achieve
--a true logic High.

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity io_decoder is
  port (ADR: in std_logic_vector (4 downto 0);
        CS: in std_logic;
        DATA: in std_logic_vector (3 downto 0);
        CLOCK: in std_logic;
        QOUT: out std_logic_vector (3 downto 0));
end io_decoder;

architecture STRUCTURE of io_decoder is

  COMPONENT DECODE1_IO
    PORT ( I: IN std_logic;
           O: OUT std_logic );
  END COMPONENT;

  COMPONENT DECODE1_INT
    PORT ( I: IN std_logic;
           O: OUT std_logic );
  END COMPONENT;

  COMPONENT DECODE4
    PORT ( A3, A2, A1, A0: IN std_logic;
           O: OUT std_logic );
  END COMPONENT;

  COMPONENT PULLUP
    PORT ( O: OUT std_logic );
  END COMPONENT;

  -- Internal Signal Declarations --

  signal DECODE, CLKEN, CLB_INT: std_logic_vector (3 downto 0);
  signal ADR_INV, CLB_INV: std_logic_vector (3 downto 0);

begin

  ADR_INV <= not ADR (3 downto 0);
  CLB_INV <= not CLB_INT;

  -- Instantiation of Edge Decoder: Output "DECODE(0)" --

  A0: DECODE4 port map (ADR(3), ADR(2), ADR(1), ADR_INV(0), DECODE(0));
  A1: DECODE1_IO port map (ADR(4), DECODE(0));
  A2: DECODE1_INT port map (CLB_INV(0), DECODE(0));
  A3: DECODE1_INT port map (CLB_INT(1), DECODE(0));
  A4: DECODE1_INT port map (CLB_INT(2), DECODE(0));
  A5: DECODE1_INT port map (CLB_INT(3), DECODE(0));
  A6: PULLUP port map (DECODE(0));

  -- Instantiation of Edge Decoder: Output "DECODE(1)" --

  B0: DECODE4 port map (ADR(3), ADR(2), ADR_INV(1), ADR(0), DECODE(1));
  B1: DECODE1_IO port map (ADR(4), DECODE(1));
  B2: DECODE1_INT port map (CLB_INT(0), DECODE(1));
  B3: DECODE1_INT port map (CLB_INV(1), DECODE(1));
  B4: DECODE1_INT port map (CLB_INT(2), DECODE(1));
  B5: DECODE1_INT port map (CLB_INT(3), DECODE(1));
  B6: PULLUP port map (DECODE(1));

  -- Instantiation of Edge Decoder: Output "DECODE(2)" --

  C0: DECODE4 port map (ADR(3), ADR_INV(2), ADR(1), ADR(0), DECODE(2));
  C1: DECODE1_IO port map (ADR(4), DECODE(2));
  C2: DECODE1_INT port map (CLB_INT(0), DECODE(2));
  C3: DECODE1_INT port map (CLB_INT(1), DECODE(2));
  C4: DECODE1_INT port map (CLB_INV(2), DECODE(2));
  C5: DECODE1_INT port map (CLB_INT(3), DECODE(2));
  C6: PULLUP port map (DECODE(2));

  -- Instantiation of Edge Decoder: Output "DECODE(3)" --

  D0: DECODE4 port map (ADR_INV(3), ADR(2), ADR(1), ADR(0), DECODE(3));
  D1: DECODE1_IO port map (ADR(4), DECODE(3));
  D2: DECODE1_INT port map (CLB_INT(0), DECODE(3));
  D3: DECODE1_INT port map (CLB_INT(1), DECODE(3));
  D4: DECODE1_INT port map (CLB_INT(2), DECODE(3));
  D5: DECODE1_INT port map (CLB_INV(3), DECODE(3));
  D6: PULLUP port map (DECODE(3));

  -- CLKEN is the AND function of CS & DECODE --

  CLKEN(0) <= CS and DECODE(0);
  CLKEN(1) <= CS and DECODE(1);
  CLKEN(2) <= CS and DECODE(2);
  CLKEN(3) <= CS and DECODE(3);

  -- Internal 4-bit counter --

  process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      CLB_INT <= CLB_INT + 1;
    end if;
  end process;

  -- "QOUT(0)" Data Register Enabled by "CLKEN(0)" --

  process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      if (CLKEN(0) = '1') then
        QOUT(0) <= DATA(0);
      end if;
    end if;
  end process;

  -- "QOUT(1)" Data Register Enabled by "CLKEN(1)" --

  process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      if (CLKEN(1) = '1') then
        QOUT(1) <= DATA(1);
      end if;
    end if;
  end process;

  -- "QOUT(2)" Data Register Enabled by "CLKEN(2)" --

  process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      if (CLKEN(2) = '1') then
        QOUT(2) <= DATA(2);
      end if;
    end if;
  end process;

  -- "QOUT(3)" Data Register Enabled by "CLKEN(3)" --

  process (CLOCK)
  begin
    if (CLOCK'event and CLOCK='1') then
      if (CLKEN(3) = '1') then
        QOUT(3) <= DATA(3);
      end if;
    end if;
  end process;

end STRUCTURE;
Figure 4-6 Schematic Block Representation of I/O Decoder

Note: In the previous figure, the pull-up resistors are inside the 
Decoder blocks. 

Instantiating LogiBLOX Modules 

Note: Refer to the LogiBLOX Guide for detailed instructions on using 
LogiBLOX. 

Most synthesis tools can infer arithmetic modules from VHDL or
Verilog code for these operators: +, -, <, <=, >, >=, =, +1, -1. These
adders, subtracters, comparators, incrementers, and decrementers
use FPGA dedicated device resources, such as carry logic, to improve
the speed and area of designs. For bus widths greater than four,
library modules are generally faster unless multiple instances of the
same function are compiled together. For more information on the
module libraries, refer to your synthesis tool documentation.
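
As an illustration of operator inference, the following sketch writes an adder, a comparator, and an incrementer with plain operators. The module and port names are assumptions, and whether a given tool maps these operators onto carry logic depends on the tool and its settings.

```verilog
// Hypothetical module: arithmetic described with operators only,
// leaving the choice of implementation to the synthesis tool.
module infer_arith_sketch (A, B, SUM, A_LT_B, A_INC);
  input  [7:0] A, B;
  output [8:0] SUM;
  output       A_LT_B;
  output [7:0] A_INC;

  assign SUM    = A + B;      // adder; the ninth bit keeps the carry out
  assign A_LT_B = (A < B);    // magnitude comparator
  assign A_INC  = A + 1'b1;   // incrementer; wraps from 255 to 0
endmodule
```

No primitive or library module is instantiated here, so the same source can be retargeted without change.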

If you want to use a module that is not in the module libraries, you
can use LogiBLOX to create components that can be instantiated in
your code. This is useful for large memory arrays if your synthesis
tool does not infer memory. However, Xilinx recommends properly
constraining the synthesis and using the Xilinx-specific module
generation capabilities of your tool. A simulation model is also
created so that RTL simulation can be performed before your design
is compiled.

You can create an instance of an externally defined macro, including a
user-defined macro or a Xilinx macro (such as an I/O or flip-flop), by
instantiating what some synthesis tool vendors refer to as a "black
box" in your HDL code. These black boxes are Verilog empty module
descriptions or VHDL component declarations.
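
A Verilog black box of this kind might look like the following sketch. The macro name my_macro, its ports, and the top-level module are all hypothetical; a real design would use the name and port list of the macro that the implementation tools later resolve.

```verilog
// Hypothetical black box: port list only, no logic inside.
// The contents are supplied later by a separate netlist.
module my_macro (D, C, Q);
  input  D, C;
  output Q;
endmodule

// Hypothetical top level that instantiates the black box
module top_sketch (DATA, CLOCK, QOUT);
  input  DATA, CLOCK;
  output QOUT;

  my_macro U1 (.D(DATA), .C(CLOCK), .Q(QOUT));
endmodule
```

The empty module gives the synthesis tool only the port names and directions. Refer to your synthesis tool documentation for the attribute that keeps the tool from optimizing the empty module away.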

Some synthesis tools allow instantiation of higher order Xilinx
macros, such as counters and adders from the Unified library. Other
synthesis tools provide Xilinx macro libraries that pre-define the
Xilinx macros. Without this expansion, macros are not understood by
the implementation tools. However, Xilinx does not recommend
using these macros. The preferred method is the synthesis tool
module expansion, or if you require more control, you can instantiate
a LogiBLOX module. If necessary, use these macro libraries only with
older schematic-based designs. However, even in these cases,
schematic-based netlists are required to expand the macros, which makes
the macro library redundant. LogiBLOX modules should also be
unnecessary because the synthesis tool should provide equivalent
performance. If you find a design in which this is not true, you can
use LogiBLOX modules, and contact Xilinx and your synthesis
vendor for a solution.

LogiBLOX is a graphical tool that allows you to select from several
arithmetic, logic, I/O, sequential, and data storage modules for
inclusion in your HDL design. Use LogiBLOX to instantiate the modules
listed in the following table.

Table 4-2 LogiBLOX Modules

Module                     Description

Arithmetic

Accumulator                Adds data to or subtracts it from the current
                           value stored in the accumulator register
Adder/Subtracter           Adds or subtracts two data inputs and a carry
                           input
Comparator                 Compares the magnitude or equality of two
                           values
Counter                    Generates a sequence of count values

Logic

Constant                   Forces a constant value onto a bus
Decoder                    Routes input data to 1-of-n lines on the
                           output port
Multiplexer                Type 1, Type 2 - Routes input data on 1-of-n
                           lines to the output port
Simple Gates               Type 1, Type 2, Type 3 - Implements the AND,
                           INVERT, NAND, NOR, OR, XNOR, and XOR logic
                           functions
Tristate                   Creates a tri-stated internal data bus

I/O

Bi-directional Input/      Connects internal and external pin signals
Output
Pad                        Simulates an input/output pad

Sequential

Clock Divider              Generates a period that is a multiple of the
                           clock input period
Counter                    Generates a sequence of count values
Shift Register             Shifts the input data to the left or right

Storage

Data Register              Captures the input data on active clock
                           transitions
Memory: ROM, RAM,          Stores information and makes it readable
SYNC RAM, DP RAM


Using LogiBLOX in HDL Designs 

1. Before using LogiBLOX, verify the following.

• Xilinx software is correctly installed

• Environment variables are set correctly

• Your display environment variable is set to your machine's
display

2. To run LogiBLOX, enter the following command.

lbgui

The LogiBLOX Setup Window appears after the LogiBLOX
module generator is loaded. This window allows you to name
and customize the module you want to create.

3. Select the Vendor tab in the Setup Window. Select your synthesis
tool in the Vendor Name field to specify the correct bus notation
for connecting your module.

Select the Project Directory tab. Enter the directory location of
your project in the LogiBLOX Project Directory field.

Select the Device Family tab. Select the target device for your
design in the Device Family field.

Select the Options tab and select the applicable options for your
design as follows.

• Simulation Netlist

This option allows you to create simulation netlists of the
selected LogiBLOX module in different formats. You can
choose one or more of the outputs listed in the following
table.

Table 4-3 Simulation Netlist Options

Option                       Description

Behavioral VHDL netlist      Generates a simulation netlist in
                             behavioral VHDL; output file has a
                             .vhd extension.
Gate level EDIF netlist      Generates a simulation netlist in EDIF
                             format; output file has an .edn
                             extension.
Structural Verilog netlist   Generates a simulation netlist in
                             structural Verilog; output file has a
                             .v extension.
• Component Declaration

This option creates instantiation templates in different 
formats that can be copied into your design. You can select 
none, one, or both of the following options. 

Table 4-4 Component Declaration Options

Option                       Description

VHDL template                Generates a LogiBLOX VHDL component
                             declaration/instantiation template that
                             is copied into your VHDL design when a
                             LogiBLOX module is instantiated. The
                             output file has a .vhi extension.
Verilog template             Generates a LogiBLOX Verilog module
                             definition/instantiation template that
                             is copied into your Verilog design when
                             a LogiBLOX module is instantiated. The
                             output file has a .vei extension.


• Implementation Netlist 


Select NGC File to generate an implementation netlist in 
Xilinx NGD binary format. You must select this option when 
instantiating LogiBLOX symbols in an HDL design. The 
output file has an .ngc extension and can be used as input to 
NGDBuild. 

• LogiBLOX DRC 

Select the Stop Process on Warning option to stop module 
processing if any warning messages are encountered during 
the design process. 

For example, if you have a Verilog design, and you are simulating
with Verilog-XL, select Structural Verilog netlist, Verilog
template, NGC File, and Stop Process on Warning. For a VHDL
design simulated with VSS, select Behavioral VHDL netlist, VHDL
template, NGC File, and Stop Process on Warning.

Select OK. 

4. Enter a name in the Module Name field in the Module Selector 
Window. 
Select a base module type from the Module Type field.

Select a bus width from the Bus Width field. 

Customize your module by selecting pins and specifying 
attributes. 

After you have completed module specification, select OK. 

This initiates the generation of a component declaration/instantiation
template, a behavioral model, and an implementation netlist.

5. Copy the module declaration/instantiation into your design. The
template file created by LogiBLOX is module_name.vhi (VHDL) or
module_name.vei (Verilog), and is saved in the project directory as
specified in the LogiBLOX setup.

6. Complete the signal connections of the instantiated module to the 
rest of your design. 

Note: For more information on simulation, refer to the "Simulating
Your Design" chapter.

7. Create an implementation script. Add the appropriate attribute 
to the instantiated LogiBLOX module to prevent synthesis of this 
module. Compile your design. 

Also, if you have a Verilog design, use a remove design type of
command to make the LogiBLOX netlist unavailable before
writing the .xnf or .edif netlist.

Note: If you do not use a remove design type of command, the netlist 
file may be empty. If this occurs, the Xilinx software will trim this 
module/component and all connected logic. Refer to your synthesis 
tool documentation for the correct command and syntax. 

8. Compile your design and create a .xnf or .edif file. You can safely 
ignore the following type of warning messages. 

Warning: Can't find the design in the library WORK.
(LBR-1)

Warning: Unable to resolve reference LogiBLOX_name in
design_name. (LINK-5)

Warning: Design design_name has 1 unresolved references.
For more detailed information, use the "link" command.
(UID-341)

9. Implement your design with the Xilinx tools. Verify that the .ngc
file created by LogiBLOX is in the same project directory as the
netlist.

You may get the following warnings during the NGDBuild and
mapping steps. These messages are issued if the Xilinx software
cannot locate the corresponding .ngc file created by LogiBLOX.

Warning: basnu - logical block LogiBLOX_instance_name of
type LogiBLOX_name is unexpanded. Logical Design DRC
complete with 1 warning(s).

If you get this message, you will get the following message
during mapping.

ERROR:basnu - logical block LogiBLOX_instance_name of
type LogiBLOX_name is unexpanded. Errors detected in
general drc.

If you get these messages, first verify that the .ngc file created by 
LogiBLOX is in the project directory. If the file is there, verify that 
the module is properly instantiated in the code. 

10. To simulate your post-layout design, convert your design to a 
timing netlist and use the back-annotation flow applicable to 
your synthesis tool. 

Note: For more information on simulation, refer to the "Simulating 
Your Design" chapter. 

Instantiating a LogiBLOX “Black Box” Component 

The VHDL example in this section shows how to instantiate a
LogiBLOX "black box" component.

VHDL Example 

entity top is
  port (clk, rst, en, data: in bit; q: out bit);
end top;

architecture structural of top is

  -- Declare black_box as a boolean attribute
  attribute black_box: boolean;

  -- Declare black_box_pad_pin as a string attribute
  attribute black_box_pad_pin: string;

  -- In this example, GIZMO is a user macro that I created
  -- in a schematic editor, and
  -- that I want to directly instantiate in my VHDL design
  -- as a black box. Create a component declaration.
  component GIZMO
    port (Q: out bit; D, C, CLR: in bit);
  end component;

  -- Set the black_box attribute on GIZMO to be "true".
  attribute black_box of GIZMO: component is true;

  -- In this example, MYBUF is a user I/O macro that I created
  -- in a schematic editor, and
  -- that I want to directly instantiate in my VHDL design
  -- as a black box. Create a component declaration.
  component MYBUF
    port (O: out bit; I: in bit);
  end component;

  -- Set the black_box_pad_pin attribute on MYBUF to
  -- the pin that interfaces with the external world, "I".
  attribute black_box_pad_pin of MYBUF: component is "I";

  signal data_core: bit;

begin

  -- Instantiate a MYBUF. Here we connect
  -- data to I and data_core to O.
  data_pad: MYBUF port map (O => data_core, I => data);

  -- Instantiate a GIZMO. Here we connect q to Q,
  -- data_core to D, clk to C, and rst to CLR.
  my_gizmo: GIZMO port map (Q => q, D => data_core,
                            C => clk, CLR => rst);

end structural;
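
The black-box technique above is shown only in VHDL. A rough Verilog counterpart is sketched here. It assumes a synthesis tool that reads black_box and black_box_pad_pin directives from a comment placed after the module port list. That directive syntax is an assumption, not something this manual specifies, so check your synthesis tool documentation for the exact form.

```verilog
// Empty module declarations stand in for the schematic macros.
// The directive comments are an assumed syntax; tools that do not
// recognize them treat them as ordinary comments.
module GIZMO (Q, D, C, CLR) /* synthesis black_box */;
  output Q;
  input  D, C, CLR;
endmodule

module MYBUF (O, I) /* synthesis black_box black_box_pad_pin="I" */;
  output O;
  input  I;
endmodule

module top (clk, rst, en, data, q);
  input  clk, rst, en, data;
  output q;
  wire   data_core;

  // data drives I; data_core is driven by O
  MYBUF data_pad (.O(data_core), .I(data));

  // q is driven by Q; data_core, clk, and rst drive D, C, and CLR
  GIZMO my_gizmo (.Q(q), .D(data_core), .C(clk), .CLR(rst));
endmodule
```

As in the VHDL version, the en port is declared but not connected to either macro.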

Implementing Memory 

XC4000E/EX/XL/XLA and Spartan FPGAs provide distributed on-chip
RAM or ROM. CLB function generators can be configured as
ROM (ROM16X1, ROM32X1); level-sensitive RAM (RAM16X1,
RAM32X1); edge-triggered, single-port (RAM16X1S, RAM32X1S); or
dual-port (RAM16X1D) RAM. Level-sensitive RAMs are not available for
the Spartan family. The edge-triggered capability simplifies system
timing and provides better performance for RAM-based designs. This
distributed RAM can be used for status registers, index registers,
counter storage, constant coefficient multipliers, distributed shift
registers, LIFO stacks, latching, or any data storage operation. The
dual-port RAM simplifies FIFO designs.

Note: For more information on XC4000 family RAM, refer to the
Xilinx Web site (http://support.xilinx.com) or the current release of
The Programmable Logic Data Book.

Implementing XC4000 and Spartan ROMs 

ROMs can be implemented as follows. 

• Use RTL descriptions of ROMs 

• Instantiate 16x1 and 32x1 ROM primitives 

• Use LogiBLOX to implement any other ROM size 

VHDL and Verilog examples of an RTL description of a ROM follow. 

RTL Description of a ROM VHDL Example 

-- Behavioral 16x4 ROM Example
-- rom_rtl.vhd

library IEEE;
use IEEE.std_logic_1164.all;

entity rom_rtl is
  port (ADDR: in INTEGER range 0 to 15;
        DATA: out STD_LOGIC_VECTOR (3 downto 0));
end rom_rtl;

architecture XILINX of rom_rtl is
  subtype ROM_WORD is STD_LOGIC_VECTOR (3 downto 0);
  type ROM_TABLE is array (0 to 15) of ROM_WORD;
  constant ROM: ROM_TABLE := ROM_TABLE'(
    ROM_WORD'("0000"),
    ROM_WORD'("0001"),
    ROM_WORD'("0010"),
    ROM_WORD'("0100"),
    ROM_WORD'("1000"),
    ROM_WORD'("1100"),
    ROM_WORD'("1010"),
    ROM_WORD'("1001"),
    ROM_WORD'("1001"),
    ROM_WORD'("1010"),
    ROM_WORD'("1100"),
    ROM_WORD'("1001"),
    ROM_WORD'("1001"),
    ROM_WORD'("1101"),
    ROM_WORD'("1011"),
    ROM_WORD'("1111"));
begin
  DATA <= ROM(ADDR);  -- Read from the ROM
end XILINX;

RTL Description of a ROM Verilog Example 

/*
 * ROM_RTL.V
 * Behavioral Example of 16x4 ROM
 */

module rom_rtl (ADDR, DATA);
input [3:0] ADDR;
output [3:0] DATA;

reg [3:0] DATA;

// A memory is implemented
// using a case statement
always @(ADDR)
begin
  case (ADDR)
    4'b0000 : DATA = 4'b0000;
    4'b0001 : DATA = 4'b0001;
    4'b0010 : DATA = 4'b0010;
    4'b0011 : DATA = 4'b0100;
    4'b0100 : DATA = 4'b1000;
    4'b0101 : DATA = 4'b1000;
    4'b0110 : DATA = 4'b1100;
    4'b0111 : DATA = 4'b1010;
    4'b1000 : DATA = 4'b1001;
    4'b1001 : DATA = 4'b1001;
    4'b1010 : DATA = 4'b1010;
    4'b1011 : DATA = 4'b1100;
    4'b1100 : DATA = 4'b1001;
    4'b1101 : DATA = 4'b1001;
    4'b1110 : DATA = 4'b1101;
    4'b1111 : DATA = 4'b1111;
  endcase
end

endmodule

When using an RTL description of a ROM, the synthesis tool creates
ROMs from random logic gates that are implemented using function
generators.

Another method for implementing ROMs is instantiating the 16x1 or
32x1 ROM primitives. To define the ROM value, use the Set Attribute
or equivalent command to set the INIT property on the ROM
component.

Note: Refer to your synthesis tool documentation for the correct
syntax.

This type of command writes the ROM contents to the netlist file so
the Xilinx tools can initialize the ROM. The INIT value should be
specified in hexadecimal values. See the VHDL and Verilog RAM
examples in the following section for examples of this property using
a RAM primitive.
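
As a rough illustration of how INIT values relate to ROM contents, the simulation-only sketch below derives one 16-bit INIT word per output bit for the 16x4 table used in the RTL ROM examples. It assumes that bit n of an INIT value holds the ROM output for address n, and that the 16x4 ROM is built from four 16x1 primitives. The module name rom_init_calc is hypothetical, and how the resulting values are attached to the primitives still depends on your synthesis tool.

```verilog
// Simulation-only helper (hypothetical name). Prints the INIT word
// for each of the four 16x1 ROMs that together hold the 16x4 table.
// Assumption: INIT bit n = ROM output at address n.
module rom_init_calc;
  reg [3:0]  rom [0:15];   // same contents as the rom_rtl examples
  reg [3:0]  word;         // one ROM word, copied out for bit access
  reg [15:0] init;         // INIT word for one output bit
  integer    addr, b;

  initial begin
    rom[0]  = 4'b0000;  rom[1]  = 4'b0001;
    rom[2]  = 4'b0010;  rom[3]  = 4'b0100;
    rom[4]  = 4'b1000;  rom[5]  = 4'b1000;
    rom[6]  = 4'b1100;  rom[7]  = 4'b1010;
    rom[8]  = 4'b1001;  rom[9]  = 4'b1001;
    rom[10] = 4'b1010;  rom[11] = 4'b1100;
    rom[12] = 4'b1001;  rom[13] = 4'b1001;
    rom[14] = 4'b1101;  rom[15] = 4'b1111;

    // Collect bit b of every word into one 16-bit vector
    for (b = 0; b < 4; b = b + 1) begin
      init = 16'h0000;
      for (addr = 0; addr < 16; addr = addr + 1) begin
        word = rom[addr];
        init[addr] = word[b];
      end
      $display("DATA[%0d]: INIT = %h", b, init);
    end
  end
endmodule
```

Running this in a simulator prints four hexadecimal values, one for each ROM primitive driving DATA[0] through DATA[3].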

Implementing XC4000 Family RAMs 

Do not use RTL descriptions of RAMs in your code because they do
not compile efficiently and can cause combinatorial loops. The
exception to this is if your synthesis tool can infer memory. In this
case, you must follow a strict coding style. Refer to your vendor's
documentation for more information.

You can implement RAMs as follows.

• Instantiate 16x1 and 32x1 RAM primitives (RAM16X1,
RAM32X1, RAM16X1S, RAM32X1S, RAM16X1D)

• Use LogiBLOX to implement any other RAM size

• Infer RAMs from your code, if your synthesis tool supports
memory inference

When implementing RAM in XC4000 and Spartan designs, Xilinx
recommends using the synchronous-write, edge-triggered RAM
(RAM16X1S, RAM32X1S, or RAM16X1D) instead of the
asynchronous-write RAM (RAM16X1 or RAM32X1) to simplify write timing
and increase RAM performance.
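
To make the synchronous-write behavior concrete, here is a simplified simulation model of an edge-triggered 16x1 RAM. It is a sketch for illustration only, not the Xilinx library model, and it is not meant to be synthesized in place of the primitive. The module name ram16x1s_sketch and the all-zero power-up contents are assumptions. The pin names follow the RAM16X1S component used in the instantiation examples.

```verilog
// Simplified behavioral sketch of an edge-triggered 16x1 RAM.
// Hypothetical module; pin names mirror the RAM16X1S primitive.
module ram16x1s_sketch (O, D, A3, A2, A1, A0, WE, WCLK);
  output O;
  input  D, A3, A2, A1, A0, WE, WCLK;

  reg  [15:0] mem;                      // sixteen one-bit locations
  wire [3:0]  addr = {A3, A2, A1, A0};  // A3 is the most significant bit

  initial mem = 16'h0000;               // assumed power-up contents

  // Write is synchronous: D is stored only on a rising WCLK edge
  // while WE is high, so address and data need only meet setup
  // and hold times around that edge.
  always @(posedge WCLK)
    if (WE)
      mem[addr] <= D;

  // Read is asynchronous: O follows the addressed location
  assign O = mem[addr];
endmodule
```

With a level-sensitive RAM, the write-enable pulse itself must be timed against address and data changes. Registering the write on a clock edge removes that requirement, which is the timing simplification described above.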

Examples of an instantiation of edge-triggered RAM primitives are
provided in the following VHDL and Verilog designs. As with ROMs,
initial RAM values can be specified from the command line. The INIT
property value is specified in hexadecimal values. Refer to your
synthesis tool documentation for the correct command and syntax.

An Exemplar example of a RAM inference (ram.vhd) is also
included in this section. Check with your synthesis tool vendor for
the availability of this feature.

Instantiating RAM VHDL Example 


-- RAM_PRIMITIVE.VHD
-- Example of instantiating 4
-- 16x1 synchronous RAMs
-- HDL Synthesis Design Guide for FPGAs
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;

entity ram_primitive is
  port (DATA_IN, ADDR : in STD_LOGIC_VECTOR(3 downto 0);
        WE, CLOCK : in STD_LOGIC;
        DATA_OUT : out STD_LOGIC_VECTOR(3 downto 0));
end ram_primitive;

architecture STRUCTURAL_RAM of ram_primitive is
  component RAM16X1S
    port (D, A3, A2, A1, A0, WE, WCLK : in STD_LOGIC;
          O : out STD_LOGIC);
  end component;
begin

  RAM0 : RAM16X1S port map (O => DATA_OUT(0), D => DATA_IN(0),
                            A3 => ADDR(3), A2 => ADDR(2),
                            A1 => ADDR(1), A0 => ADDR(0),
                            WE => WE, WCLK => CLOCK);

  RAM1 : RAM16X1S port map (O => DATA_OUT(1), D => DATA_IN(1),
                            A3 => ADDR(3), A2 => ADDR(2),
                            A1 => ADDR(1), A0 => ADDR(0),
                            WE => WE, WCLK => CLOCK);

  RAM2 : RAM16X1S port map (O => DATA_OUT(2), D => DATA_IN(2),
                            A3 => ADDR(3), A2 => ADDR(2),
                            A1 => ADDR(1), A0 => ADDR(0),
                            WE => WE, WCLK => CLOCK);

  RAM3 : RAM16X1S port map (O => DATA_OUT(3), D => DATA_IN(3),
                            A3 => ADDR(3), A2 => ADDR(2),
                            A1 => ADDR(1), A0 => ADDR(0),
                            WE => WE, WCLK => CLOCK);

end STRUCTURAL_RAM;

Instantiating RAM Verilog Example 


//////////////////////////////////////////
// RAM_PRIMITIVE.V                      //
// Example of instantiating 4           //
// 16x1 Synchronous RAMs                //
// HDL Synthesis Design Guide for FPGAs //
// August 1997                          //
//////////////////////////////////////////

module ram_primitive (DATA_IN, ADDR, WE, CLOCK, DATA_OUT);
input [3:0] DATA_IN, ADDR;
input WE, CLOCK;
output [3:0] DATA_OUT;

RAM16X1S RAM0 (.O(DATA_OUT[0]), .D(DATA_IN[0]), .A3(ADDR[3]),
               .A2(ADDR[2]), .A1(ADDR[1]), .A0(ADDR[0]),
               .WE(WE), .WCLK(CLOCK));

RAM16X1S RAM1 (.O(DATA_OUT[1]), .D(DATA_IN[1]), .A3(ADDR[3]),
               .A2(ADDR[2]), .A1(ADDR[1]), .A0(ADDR[0]),
               .WE(WE), .WCLK(CLOCK));

RAM16X1S RAM2 (.O(DATA_OUT[2]), .D(DATA_IN[2]), .A3(ADDR[3]),
               .A2(ADDR[2]), .A1(ADDR[1]), .A0(ADDR[0]),
               .WE(WE), .WCLK(CLOCK));

RAM16X1S RAM3 (.O(DATA_OUT[3]), .D(DATA_IN[3]), .A3(ADDR[3]),
               .A2(ADDR[2]), .A1(ADDR[1]), .A0(ADDR[0]),
               .WE(WE), .WCLK(CLOCK));

endmodule

Inferring RAM VHDL Example 

library ieee;
use ieee.std_logic_1164.all;
library exemplar;
use exemplar.exemplar_1164.all;
library exemplar;
use exemplar.exemplar.all;

package my_pkg is
  type MEM_WORD is array (6 downto 0) of elbit_vector (1 downto 0);
end my_pkg;

library exemplar;
use exemplar.exemplar.all;
use work.my_pkg.all;

entity mem is
  port (dio : inout elbit_vector (1 downto 0);
        meme, we, inclk, outclk : in bit;
        addr : integer range 6 downto 0;
        ro : out bit);
  attribute clock_node : boolean;
  attribute clock_node of inclk : signal is TRUE;
  attribute clock_node of outclk : signal is TRUE;
end mem;

architecture behav of mem is
  signal mem : MEM_WORD;
  signal d_int : elbit_vector (1 downto 0);
begin

  process (inclk)
  begin
    if (inclk'event and inclk = '1') then
      if (meme = '1' and we = '1') then
        mem (addr) <= dio;
      end if;
    end if;
  end process;

  process (outclk)
  begin
    if (outclk'event and outclk = '1') then
      d_int <= mem (addr);
    end if;
  end process;

  dio <= d_int when (meme = '1' and we = '0') else "ZZ";

end behav;

Using LogiBLOX to Implement Memory 

If you must instantiate memory, use LogiBLOX to create a memory 
module larger than 32X1 (16X1 for Dual Port). Implementing 
memory with LogiBLOX is similar to implementing any module with 
LogiBLOX except for defining the Memory initialization file. Use the 
following steps to create a memory module. 

Note: Refer to the "Using LogiBLOX in HDL Designs" section for 
more information on using LogiBLOX. 

1. Before using LogiBLOX, verify the following. 

• Xilinx software is correctly installed 

• Environment variables are set correctly 

• Your display environment variable is set to your machine's 
display 

2. To run LogiBLOX, enter the following command. 

lbgui 

The LogiBLOX Setup Window appears after the LogiBLOX
module generator is loaded. This window allows you to name
and customize the module you want to create.

3. Select the Vendor tab in the Setup Window. Select your synthesis
tool in the Vendor Name field to specify the correct bus notation
for connecting your module.

Select the Project Directory tab. Enter the directory location of 
your project in the LogiBLOX Project Directory field. 

Select the Device Family tab. Select the target device for your 
design in the Device Family field. 

Select the Options tab and select the applicable options for your 
design. 

Select OK. 

4. Enter a name in the Module Name field in the Module Selector 
Window. 

Select the Memories module type from the Module Type field to 
specify that you are creating a memory module. 

Select a width (any value from 1 to 64 bits) for the memory from
the Data Bus Width field.

In the Details field, select the type of memory you are creating
(ROM, RAM, SYNC_RAM, or DP_RAM).

Enter a value in the Memory Depth field for your memory 
module. 

Note: Xilinx recommends (this is not a requirement) that you select a 
memory depth value that is a multiple of 16 because this is the 
memory size of one lookup table. 

5. If you want the memory module initialized to all zeros on
power-up, you do not need to create a memory file (Mem File).
However, if you want the contents of the memory initialized to a
value other than zero, you must create and edit a memory file.
Enter a memory file name in the Mem File field and click on the
Edit button. Continue with the following steps.

Note: Some memory modules can only be initialized to zero. Refer to
the Xilinx Programmable Logic Data Book for more information.

a) A memory template file in a text editor is displayed. This file
does not contain valid data, and must be edited before you
can use it. The data values specified in the memory file Data
Section define the contents of the memory. Data values are
specified sequentially, beginning with the lowest address in
the memory, as defined.

b) Specify the address of a data value. The default radix of the
data values is 16. If more than one radix definition is listed in
the memory file header section, the last definition is the radix
used in the Data Section.

The following definition defines a 16-word memory with the
contents 6, 4, 5, 5, 2, 7, 5, 3, 5, 5, 5, 5, 5, 5, 5, 5, starting at
address 0. Note that the contents of locations 2, 3, 6, and 8
through 15 are defined via the default definition. Two
starting addresses, 4 and 7, are given.

depth 16
default 5
data 6, 4,
4: 2, 7,
7: 3

c) After you have finished specifying the data for the memory 
module, save the file and exit the editor. 

6. Click the OK button. Selecting OK generates a component
declaration/instantiation template, a behavioral model, and an
implementation netlist.

7. Copy the HDL module declaration/instantiation into your HDL
design. The template file created by LogiBLOX is
module_name.vhi for VHDL and module_name.vei for Verilog, and
is saved in the project directory as specified in the LogiBLOX
setup.

8. Complete the signal connections of the instantiated LogiBLOX 
memory module to the rest of your HDL design, and complete 
initial design coding. 

9. Perform a behavioral simulation on your design. For more
information on behavioral simulation, refer to the "Simulating Your
Design" chapter.

10. Create an implementation script. Add a Set Don't Touch or
equivalent attribute to the instantiated LogiBLOX memory module,
and compile your design.

Also, if you have a Verilog design, use a remove design type of
command before writing the .xnf or .edif netlist.

Note: If you do not use this type of command, the netlist file may be
empty. If this occurs, the Xilinx software will trim this
module/component and all connected logic. Refer to your synthesis
tool documentation for the correct syntax.

11. Compile your design and create a .xnf or .edif file. You can safely
ignore the following type of warning messages.

Warning: Can't find the design in the library WORK.
(LBR-1)

Warning: Unable to resolve reference LogiBLOX_name in
design_name. (LINK-5)

Warning: Design design_name has 1 unresolved references.
For more detailed information, use the "link" command.
(UID-341)

12. Implement your design with the Xilinx tools. Verify that the .ngc
file created by LogiBLOX is in the same project directory as the
netlist.

You may get the following warnings during the NGDBuild and
mapping steps. These messages are issued if the Xilinx software
cannot locate the corresponding .ngc file created by LogiBLOX.

Warning: basnu - logical block LogiBLOX_instance_name of
type LogiBLOX_name is unexpanded. Logical Design DRC
complete with 1 warning(s).

If you get this message, you will get the following message
during mapping.

ERROR:basnu - logical block LogiBLOX_instance_name of
type LogiBLOX_name is unexpanded. Errors detected in
general drc.

If you get these messages, first verify that the .ngc file created by 
LogiBLOX is in the project directory. If the file is there, verify that 
the module is properly instantiated in the code. 

13. To simulate your post-layout design, convert your design to a 
timing netlist and use the back-annotation flow applicable to 
your synthesis tool. 

Note: For more information on simulation, refer to the "Simulating 
Your Design" chapter. 
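
The memory file example in step 5 can be checked with a small simulation-only sketch. It reproduces the rule that the default value fills every location not covered by explicit data. The module name memfile_sketch and the 4-bit word width are assumptions made for illustration. LogiBLOX itself reads the memory file directly and does not need this code.

```verilog
// Simulation-only sketch (hypothetical module name) that rebuilds the
// contents described by the example memory file:
//   depth 16
//   default 5
//   data 6, 4,
//   4: 2, 7,
//   7: 3
// A 4-bit word width is assumed.
module memfile_sketch;
  reg [3:0] mem [0:15];
  integer   i;

  initial begin
    // Fill every location with the default value
    for (i = 0; i < 16; i = i + 1)
      mem[i] = 4'h5;

    // Sequential data, beginning at the lowest address
    mem[0] = 4'h6;
    mem[1] = 4'h4;

    // Data following the "4:" starting address
    mem[4] = 4'h2;
    mem[5] = 4'h7;

    // Data following the "7:" starting address
    mem[7] = 4'h3;

    for (i = 0; i < 16; i = i + 1)
      $display("address %0d = %h", i, mem[i]);
  end
endmodule
```

The printed contents should match the sequence given in step 5, with locations 2, 3, 6, and 8 through 15 holding the default value.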

Implementing Boundary Scan (JTAG 1149.1)

Note: Refer to the Development System Reference Guide for a detailed 
description of the XC4000/XC5200 boundary scan capabilities. 

XC4000, Spartan, and XC5200 FPGAs contain boundary scan facilities
that are compatible with IEEE Standard 1149.1. Xilinx devices
support external (I/O and interconnect) testing and have limited
support for internal self-test.

You can access the built-in boundary scan logic between power-up
and the start of configuration. Optionally, the built-in logic is
available after configuration if you specify boundary scan in your design.
During configuration, a reduced boundary scan capability (sample/
preload and bypass instructions) is available.

In a configured FPGA device, the boundary scan logic is enabled or 
disabled by a specific set of bits in the configuration bitstream. To 
access the boundary scan logic after configuration in HDL designs, 
you must instantiate the boundary scan symbol, BSCAN, and the 
boundary scan I/O pins, TDI, TMS, TCK, and TDO. 

The XC5200 BSCAN symbol contains three additional pins: RESET, 
UPDATE, and SHIFT, which are not available for XC4000 and 
Spartan. These pins represent the decoding of the corresponding state 
of the boundary scan internal state machine. If this function is not 
used, you can leave these pins unconnected in your HDL design. 

Instantiating the Boundary Scan Symbol 

To incorporate the boundary scan capability in a configured FPGA
using synthesis tools, you must manually instantiate boundary scan
library primitives at the source code level. These primitives include
TDI, TMS, TCK, TDO, and BSCAN. The following VHDL and Verilog
examples show how to instantiate the boundary scan symbol,
BSCAN, into your HDL code. Note that the boundary scan I/O pins
are not declared as ports in the HDL code. The schematic for this
design is shown in the "Bnd_scan Schematic" figure.

You must assign a Set Don't Touch or equivalent attribute to the net
connected to the TDO pad before using the Insert Pads (or equivalent)
and compile commands. Otherwise, the TDO pad is removed by
the compiler. In addition, you do not need IBUFs or OBUFs for the
TDI, TMS, TCK, and TDO pads. These special pads connect directly
to the Xilinx boundary scan module.

Boundary Scan VHDL Example 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity bnd_scan is
  port (TDI_P, TMS_P, TCK_P : in STD_LOGIC;
        LOAD_P, CE_P, CLOCK_P, RESET_P: in STD_LOGIC;
        DATA_P: in STD_LOGIC_VECTOR(3 downto 0);
        TDO_P: out STD_LOGIC;
        COUT_P: out STD_LOGIC_VECTOR(3 downto 0));
end bnd_scan;

architecture XILINX of bnd_scan is

  -- BSCAN receives TDI, TMS, and TCK from the pads and drives TDO
  component BSCAN
    port (TDI, TMS, TCK: in STD_LOGIC;
          TDO: out STD_LOGIC);
  end component;

  component TDI
    port (I: out STD_LOGIC);
  end component;

  component TMS
    port (I: out STD_LOGIC);
  end component;

  component TCK
    port (I: out STD_LOGIC);
  end component;

  -- The TDO pad is driven by the TDO output of BSCAN
  component TDO
    port (O: in STD_LOGIC);
  end component;

  component count4
    port (LOAD, CE, CLOCK, RST: in STD_LOGIC;
          DATA: in STD_LOGIC_VECTOR (3 downto 0);
          COUT: out STD_LOGIC_VECTOR (3 downto 0));
  end component;

  -- Defining signals to connect BSCAN to Pins --
  signal TCK_NET : STD_LOGIC;
  signal TDI_NET : STD_LOGIC;
  signal TMS_NET : STD_LOGIC;
  signal TDO_NET : STD_LOGIC;

begin

  U1: BSCAN port map (TDO => TDO_NET,
                      TDI => TDI_NET,
                      TMS => TMS_NET,
                      TCK => TCK_NET);

  U2: TDI port map (I => TDI_NET);

  U3: TCK port map (I => TCK_NET);

  U4: TMS port map (I => TMS_NET);

  U5: TDO port map (O => TDO_NET);

  U6: count4 port map (LOAD => LOAD_P,
                       CE => CE_P,
                       CLOCK => CLOCK_P,
                       RST => RESET_P,
                       DATA => DATA_P,
                       COUT => COUT_P);

end XILINX;

Boundary Scan Verilog Example 


/////////////////////////////////////////////////////
// BND_SCAN.V                                      //
// Example of instantiating the BSCAN symbol in    //
// activating the Boundary Scan circuitry          //
// Count4 is an instantiated .v file of a counter  //
// September 1997                                  //
/////////////////////////////////////////////////////

module bnd_scan (LOAD_P, CLOCK_P, CE_P, RESET_P,
                 DATA_P, COUT_P);

input LOAD_P, CLOCK_P, CE_P, RESET_P;
input [3:0] DATA_P;
output [3:0] COUT_P;

wire TDI_NET, TMS_NET, TCK_NET, TDO_NET;

BSCAN U1 (.TDO(TDO_NET), .TDI(TDI_NET), .TMS(TMS_NET), .TCK(TCK_NET));
TDI U2 (.I(TDI_NET));
TCK U3 (.I(TCK_NET));
TMS U4 (.I(TMS_NET));
TDO U5 (.O(TDO_NET));

count4 U6 (.LOAD(LOAD_P), .CLOCK(CLOCK_P), .CE(CE_P),
           .RST(RESET_P), .DATA(DATA_P), .COUT(COUT_P));

endmodule

[Schematic not reproduced. Input ports shown: CE_P, CLOCK_P,
DATA_P<3:0>, LOAD_P, and RESET_P.]

Figure 4-7 Bnd_scan Schematic

Implementing Logic with IOBs

You can move logic that is normally implemented with CLBs to IOBs.
By moving logic from CLBs to IOBs, additional logic can be
implemented in the available CLBs. Using IOBs also improves design
performance by increasing the number of available routing resources.

The XC4000 and Spartan devices have different IOB functions. The
following sections provide a general description of the IOB function
in XC4000E/EX/XLA/XL/XV and Spartan devices. A description of
how to manually implement additional I/O features is also provided.

XC4000E/EX/XLA/XL/XV and Spartan IOBs

You can configure XC4000E/EX/XLA/XL/XV and Spartan IOBs as
input, output, or bidirectional signals. You can also specify pull-up or
pull-down resistors, independent of the pin usage.

These various buffer and I/O structures can be inferred from
commands executed in a script or in your synthesis tool. The Set Port
Is Pad (or equivalent) command in conjunction with the Insert Pads
(or equivalent) command creates the appropriate buffer structure
according to the direction of the specified port in the HDL code. You
can add attributes to these commands to further control pull-up,
pull-down, and clock buffer insertion, as well as slew-rate control. Some
tools operate on I/Os by selecting a chip level (inserts I/O) or module
level (no I/O) synthesis. Also, you can add synthesis tool attributes,
such as BUFFER_SIG, to ports in your VHDL code to control
insertion of I/Os.

Inputs 

The buffered input signal drives the data input of a storage
element, which can be configured as either a flip-flop or a latch.
Additionally, the buffered signal can be used in conjunction with the
input flip-flop or latch, or without the register.

To avoid external hold-time requirements, IOB input flip-flops and
latches have a delay block between the external pin and the D input.
You can remove this default delay by instantiating a flip-flop or latch
with a NODELAY attribute. The NODELAY attribute decreases the
setup-time requirement and introduces a small hold time.

If an IOB or register is instantiated in your HDL code, you may not be
able to use the Set Port Is Pad (or equivalent) command on that port.
Doing so may automatically infer a buffer on that port and create an
invalid double-buffer structure. This varies with the tool you are
using. Check with your synthesis vendor to see if partial instantiation
interferes with automatic I/O insertion or the use of IOB registers.

Registers that connect to an input or output pad and require a Direct
Clear or Preset pin are not implemented by the synthesis tool in the
IOB. The VHDL emulation of GSR or GR on these registers prevents
them from being pulled into the IOB. The VHDL emulation of GSR/
GR through direct clear or preset pins is described in the "Simulating
Your Design" chapter. If GSR/GR behavior is not completely
described, automatic inferencing of GSR/GR does not occur. In this
case, instantiate STARTBUF in VHDL, and fully describe the GSR/
GR behavior except for registers that you want in the IOB. In VHDL,
these registers do not initialize pre-route, but do indicate X's until the
first data is registered. However, they do initialize properly during
back-annotation. Verilog models initialize properly and do not
interfere with the automatic use of IOB registers instead of CLB registers.

Outputs 

The output signal that drives the programmable tristate output buffer
can be a registered or a direct output. The register is a positive-edge
triggered flip-flop and the clock polarity can be inverted inside the
IOB. (Xilinx software automatically optimizes any inverters into the
IOB.) The XC4000 and Spartan output buffers can sink 12 mA. Two
adjacent outputs can be interconnected externally to sink up to
24 mA.

Note: Most FPGA synthesis tools can optimize flip-flops attached to
output pads into the IOB. However, some of these tools cannot
optimize flip-flops into an IOB configured as a bidirectional pad. Refer to
your synthesis tool documentation for more information.

Slew Rate 

Refer to your synthesis tool documentation for information on
configuring I/Os, including how to control slew rate.

Pull-ups and Pull-downs

XC4000 and Spartan devices have programmable pull-up and
pull-down resistors available in the I/O regardless of whether it is
configured as an input, output, or bi-directional I/O. By default, all unused
IOBs are configured as an input with a pull-up resistor. The values of
the pull-ups and pull-downs vary depending on operating conditions
and device process variances but should be approximately 50 K
Ohms to 100 K Ohms. If a more precise value is required, use an
external resistor. Refer to your synthesis tool documentation for
information on how to specify internal pull-up or pull-down I/O
resistors.

XC4000EX/XLA/XL/XV Output Multiplexer/2-Input
Function Generator

A function added to the XC4000EX/XLA/XL/XV families is a two-input
multiplexer connected to the IOB output. It allows the output clock to
select either the output data or the IOB clock enable as the signal
that drives the output pad. This allows you to share output pins
between two signals, effectively doubling the number of device outputs
without requiring a larger device or package. Additionally, this
multiplexer can be configured as a two-input function generator, which
allows you to implement any 2-input logic function in the IOB. This
frees up additional logic resources in the device and allows for very
fast pin-to-pin data paths.
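
The selection behavior described above can be pictured with a small functional sketch. This is a hypothetical stand-in, not the Xilinx library model. It assumes that S0 low selects D0 and S0 high selects D1, and the module name omux2_sketch is invented for illustration. The pin names follow the OMUX2 component used in the examples in this section.

```verilog
// Functional sketch only (hypothetical module name).
// Assumption: S0 = 0 passes D0 to the pad, S0 = 1 passes D1.
module omux2_sketch (O, D0, D1, S0);
  output O;
  input  D0, D1, S0;

  // Two signals share one output pad, chosen by the select input
  assign O = S0 ? D1 : D0;
endmodule
```

When the same IOB resource is used as a two-input function generator, the multiplexer expression is replaced by the chosen logic function, for example an AND or XOR of two inputs.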

To use the output multiplexer (OMUX), you must instantiate it in
your code. See the following VHDL and Verilog examples.
Instantiation of the other types of two-input output primitives (such as
OAND2, OOR2, and OXOR2) is similar to these examples.

Note: Since the OMUX uses the IOB output clock and clock enable
routing structures, the output flip-flop (OFD) cannot be used within
the same IOB. The input flip-flop (IFD) can be used if the clock enable
is not used.

• Output Multiplexer VHDL Example

-- OMUX_EXAMPLE.VHD
-- Example of OMUX instantiation
-- For an XC4000EX/XL/XV device
-- HDL Synthesis Design Guide for FPGAs
-- August 1997

library IEEE;
use IEEE.std_logic_1164.all;

entity omux_example is
  port (DATA_IN: in STD_LOGIC_VECTOR (1 downto 0);
        SEL: in STD_LOGIC;
        DATA_OUT: out STD_LOGIC);
end omux_example;

architecture XILINX of omux_example is
  component OMUX2
    port (D0, D1, S0 : in STD_LOGIC;
          O : out STD_LOGIC);
  end component;
begin

  DUEL_OUT: OMUX2 port map (O=>DATA_OUT,
      D0=>DATA_IN(0), D1=>DATA_IN(1), S0=>SEL);

end XILINX;

• Output Multiplexer Verilog Example 

//////////////////////////////////////////
// OMUX_EXAMPLE.V                       //
// Example of instantiating an OMUX2    //
// in an XC4000EX/XL IOB                //
// HDL Synthesis Design Guide for FPGAs //
// August 1997                          //
//////////////////////////////////////////

module omux_example (DATA_IN, SEL, DATA_OUT);

input [1:0] DATA_IN;
input SEL;
output DATA_OUT;

OMUX2 DUEL_OUT (.O(DATA_OUT), .D0(DATA_IN[0]),
                .D1(DATA_IN[1]), .S0(SEL));

endmodule


XC5200 IOBs

XC5200 IOBs consist of an input buffer and an output buffer that can
be configured as an input, output, or bi-directional I/O. The structure
of the XC5200 IOB is similar to that of the XC4000 IOB, except that the
XC5200 IOB does not contain a register/latch. The XC5200 IOB has a
programmable pull-up or pull-down resistor, and two slew rate control
modes (Fast and Slow) to minimize bus transients. The input buffer can be
globally configured to TTL or CMOS levels, and the output buffer can
sink or source 8.0 mA.

I/O buffer structures (as with the XC4000 IOBs) can be inferred from
your synthesis tool script with the Set Port Is Pad (or equivalent)
command in conjunction with the Insert Pads (or equivalent)
command. Pull-up and pull-down insertion and slew rate control
are specified as previously described for the XC4000
IOB.

The XC5200 IOB also contains a delay element so that an input signal
that is directly registered or latched can have a guaranteed zero hold
time at the expense of a longer setup time. You can disable this
(equivalent to NODELAY in XC4000) by instantiating an IBUF_F
buffer for that input port. This only needs to be done for ports that
connect directly to the D input of a register in which a hold time can
be tolerated.

Bi-directional I/O 

You can create bi-directional I/O with one or a combination of the 
following methods. 

• Behaviorally describe the I/O path 

• Structurally instantiate appropriate IOB primitives 

• Create the I/O using LogiBLOX 

Xilinx FPGA IOBs consist of a direct input path into the FPGA
through an input buffer (IBUF) and an output path to the FPGA pad
through a tri-stated buffer (OBUFT). The input path can be registered
or latched; the output path can be registered. If you instantiate or
behaviorally describe the I/O, you must describe this bi-directional
path in two steps. First, describe an input path from the declared
INOUT port to a logic function or register. Second, describe an output
path from an internal signal or function in your code to a tri-stated
output with a tri-state control signal that can be mapped to an
OBUFT.

You should always describe the I/O path at the top level of your 
code. If the I/O path is described in a lower level module, your 
synthesis tool may incorrectly create the I/O structure. 

Inferring Bi-directional I/O 

This section includes VHDL and Verilog examples that show how to
infer a bi-directional I/O. In these examples, the input path is latched
by a CLB latch that is gated by the active high READ_WRITE signal.

The output consists of two latched outputs with an AND and OR,
and connected to a described tri-state buffer. The active low
READ_WRITE signal enables the tri-state gate.

• Inferring a Bi-directional Pin VHDL Example

-- BIDIR_INFER.VHD
-- Example of inferring a Bi-directional pin
-- August 1997

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity bidir_infer is
    port (DATA       : inout STD_LOGIC_VECTOR(1 downto 0);
          READ_WRITE : in    STD_LOGIC);
end bidir_infer;

architecture XILINX of bidir_infer is

    signal LATCH_OUT : STD_LOGIC_VECTOR(1 downto 0);

begin

    process(READ_WRITE, DATA)
    begin
        if (READ_WRITE = '1') then
            LATCH_OUT <= DATA;
        end if;
    end process;

    process(READ_WRITE, LATCH_OUT)
    begin
        if (READ_WRITE = '0') then
            DATA(0) <= LATCH_OUT(0) and LATCH_OUT(1);
            DATA(1) <= LATCH_OUT(0) or LATCH_OUT(1);
        else
            DATA(0) <= 'Z';
            DATA(1) <= 'Z';
        end if;
    end process;

end XILINX;

• Inferring a Bi-directional Pin Verilog Example

////////////////////////////////////////////////////////////////////
// BIDIR_INFER.V Version 1.1                                      //
// This is an example of an inference of a bi-directional signal. //
// Note: Logic description of port should always be on top-level  //
// code when using Synopsys Compiler and verilog.                 //
// March 1996                                                     //
////////////////////////////////////////////////////////////////////

module bidir_infer (DATA, READ_WRITE);

input READ_WRITE ;
inout [1:0] DATA ;

reg [1:0] LATCH_OUT ;

always @ (READ_WRITE or DATA)
begin
    if (READ_WRITE == 1'b1)
        LATCH_OUT <= DATA;
end

assign DATA[0] = READ_WRITE ? 1'bZ : (LATCH_OUT[0] & LATCH_OUT[1]);
assign DATA[1] = READ_WRITE ? 1'bZ : (LATCH_OUT[0] | LATCH_OUT[1]);

endmodule


Instantiating Bi-directional I/O 

Instantiating the bi-directional I/O gives you more control over the
implementation of the circuit; however, as a result, your code is more
architecture-specific and usually more verbose. The VHDL and
Verilog examples in this section are identical to the examples in the
"Inferring Bi-directional I/O" section; however, since there is more
control over the implementation, an input latch is specified rather
than the CLB latch inferred in the previous examples. The following
examples are a more efficient implementation of the same circuit.

When instantiating I/O primitives, do not specify the Set Port Is Pad 
(or equivalent) command on the instantiated ports to prevent the I/O 
buffers from being inferred by your synthesis tool. This precaution 
also prevents the creation of an illegal structure. 

• Instantiation of a Bi-directional Pin VHDL Example

-- BIDIR_INSTANTIATE.VHD
-- Example of an instantiation
-- of a Bi-directional pin
-- August 1997

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity bidir_instantiate is
    port (DATA       : inout STD_LOGIC_VECTOR(1 downto 0);
          READ_WRITE : in    STD_LOGIC);
end bidir_instantiate;

architecture XILINX of bidir_instantiate is

    signal LATCH_OUT : STD_LOGIC_VECTOR(1 downto 0);
    signal DATA_OUT  : STD_LOGIC_VECTOR(1 downto 0);
    signal GATE      : STD_LOGIC;

    component ILD_1
        port (D, G : in  STD_LOGIC;
              Q    : out STD_LOGIC);
    end component;

    component OBUFT_S
        port (I, T : in  STD_LOGIC;
              O    : out STD_LOGIC);
    end component;

begin

    DATA_OUT(0) <= LATCH_OUT(0) and LATCH_OUT(1);
    DATA_OUT(1) <= LATCH_OUT(0) or LATCH_OUT(1);
    GATE <= not READ_WRITE;

    INPUT_PATH_0 : ILD_1
        port map (D => DATA(0), G => GATE,
                  Q => LATCH_OUT(0));

    INPUT_PATH_1 : ILD_1
        port map (D => DATA(1), G => GATE,
                  Q => LATCH_OUT(1));

    OUPUT_PATH_0 : OBUFT_S
        port map (I => DATA_OUT(0), T => READ_WRITE,
                  O => DATA(0));

    OUPUT_PATH_1 : OBUFT_S
        port map (I => DATA_OUT(1), T => READ_WRITE,
                  O => DATA(1));

end XILINX;

• Instantiation of a Bi-directional Pin Verilog Example

////////////////////////////////////////////
// BIDIR_INSTANTIATE.V                    //
// This is an example of an instantiation //
// of a bi-directional port.              //
// August 1997                            //
////////////////////////////////////////////

module bidir_instantiate (DATA, READ_WRITE);

input READ_WRITE ;
inout [1:0] DATA ;

// LATCH_OUT is driven by the ILD_1 instance outputs, so it is a wire
wire [1:0] LATCH_OUT ;
wire [1:0] DATA_OUT ;
wire GATE ;

assign GATE = ~READ_WRITE;

assign DATA_OUT[0] = LATCH_OUT[0] & LATCH_OUT[1];
assign DATA_OUT[1] = LATCH_OUT[0] | LATCH_OUT[1];

// I/O primitive instantiation

ILD_1 INPUT_PATH_0 (.Q(LATCH_OUT[0]), .D(DATA[0]), .G(GATE));
ILD_1 INPUT_PATH_1 (.Q(LATCH_OUT[1]), .D(DATA[1]), .G(GATE));

OBUFT_S OUPUT_PATH_0 (.O(DATA[0]), .I(DATA_OUT[0]), .T(READ_WRITE));
OBUFT_S OUPUT_PATH_1 (.O(DATA[1]), .I(DATA_OUT[1]), .T(READ_WRITE));

endmodule


Using LogiBLOX to Create Bi-directional I/O 

You can use LogiBLOX to create I/O structures in an FPGA. LogiBLOX
gives you the same control as instantiating I/O primitives, and
is usually less verbose. LogiBLOX is especially useful for bused I/O
ports.

Note: Refer to the "Using LogiBLOX in HDL Designs" section
for details on creating, instantiating, and compiling LogiBLOX
modules.

Do not use the Set Port Is Pad (or equivalent) command on
LogiBLOX-created ports. Also, when designing with Verilog, you must
issue a Remove Design or equivalent command before writing out
the .xnf files from your synthesis tool.

The following VHDL and Verilog examples show how to instantiate
bi-directional I/O created with LogiBLOX. These examples produce
the same results as the examples in the "Instantiating Bi-directional
I/O" section.

• Using LogiBLOX to Create a Bi-directional Port VHDL Example


-- BIDIR_LOGIBLOX.VHD
-- Example of using LogiBLOX
-- to create a Bi-directional port
-- August 1997

-- LogiBLOX BIDI Module "bidir_io_from_lb"
-- Created by LogiBLOX version M1.3.7
-- on Mon Sep 8 13:14:02 1997
-- Attributes
--   MODTYPE = BIDI
--   BUS_WIDTH = 2
--   IN_TYPE = LATCH
--   OUT_TYPE = TRI

Library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.STD_LOGIC_UNSIGNED.all;

entity bidir_logiblox is
    port (DATA       : inout STD_LOGIC_VECTOR(1 downto 0);
          READ_WRITE : in    STD_LOGIC);
end bidir_logiblox;

architecture XILINX of bidir_logiblox is

    signal LATCH_OUT : STD_LOGIC_VECTOR(1 downto 0);
    signal DATA_OUT  : STD_LOGIC_VECTOR(1 downto 0);

    -- Component Declaration

    component bidir_io_from_lb
        PORT(
            O:     IN    STD_LOGIC_VECTOR(1 DOWNTO 0);
            OE:    IN    STD_LOGIC;
            IGATE: IN    STD_LOGIC;
            IQ:    OUT   STD_LOGIC_VECTOR(1 DOWNTO 0);
            P:     INOUT STD_LOGIC_VECTOR(1 DOWNTO 0));
    end component;

begin

    DATA_OUT(0) <= LATCH_OUT(0) and LATCH_OUT(1);
    DATA_OUT(1) <= LATCH_OUT(0) or LATCH_OUT(1);

    -- Component Instantiation

    BIDIR_BUSSED_PORT : bidir_io_from_lb
        port map (O => DATA_OUT, OE => READ_WRITE,
                  IGATE => READ_WRITE, IQ => LATCH_OUT, P => DATA);

end XILINX;

• Using LogiBLOX to Create a Bi-directional Port Verilog Example

//////////////////////////////////////////////
// bidir_logiblox.v                         //
// This is an example of using LogiBLOX     //
// to create a bi-directional port.         //
// August 1997                              //
//////////////////////////////////////////////

//--------------------------------------------
// LogiBLOX BIDI Module "bidir_io_from_lb"
// Created by LogiBLOX version M1.3.7
// on Mon Sep 8 17:10:15 1997
// Attributes
//   MODTYPE = BIDI
//   BUS_WIDTH = 2
//   IN_TYPE = LATCH
//   OUT_TYPE = TRI
//--------------------------------------------

module bidir_logiblox (DATA, READ_WRITE);

input READ_WRITE ;
inout [1:0] DATA ;

// LATCH_OUT is driven by the LogiBLOX module output, so it is a wire
wire [1:0] LATCH_OUT ;
wire [1:0] DATA_OUT ;

assign DATA_OUT[0] = LATCH_OUT[0] & LATCH_OUT[1];
assign DATA_OUT[1] = LATCH_OUT[0] | LATCH_OUT[1];

// LogiBLOX instantiation

bidir_io_from_lb BIDIR_BUSSED_PORT
    ( .O(DATA_OUT), .OE(READ_WRITE), .P(DATA),
      .IQ(LATCH_OUT), .IGATE(READ_WRITE));

endmodule

module bidir_io_from_lb (O, OE, P, IQ, IGATE);
input [1:0] O;
input OE;
input IGATE;
inout [1:0] P;
output [1:0] IQ;
endmodule

Specifying Pad Locations 

Although Xilinx recommends allowing the software to select pin
locations to ensure the best possible pin placement in terms of design
timing and routing resources, sometimes you must define the pad
locations prior to placement and routing. You can assign pad locations
either from your synthesis tool's script prior to writing out the
netlist file, or from a User Constraints File (UCF). Use one or the other
method, but not both. Refer to your synthesis tool documentation for 
the correct syntax for configuring your I/O with the PLOC properly. 
Also, refer to The Programmable Logic Data Book or the Xilinx Web site 
(http://support.xilinx.com) for the pad locations for your device and 
package. 

Moving Registers into the IOB 

Note: XC5200 devices do not have input and output flip-flops.

IOBs contain an input register or latch and an output register. IOB
inputs can be register or latch inputs as well as direct inputs to the
device array. Registers without a direct reset or set function can be
moved into IOBs. Moving registers or latches into IOBs may reduce
the number of CLBs used and decrease routing congestion. In
addition, moving input registers and latches into the IOB reduces the
external setup time, as shown in the following figure.

Figure 4-8 Moving Registers into the IOB

Although moving output registers into the IOB may increase the
internal setup time, it may reduce the clock-to-output delay, as shown
in this figure. Most FPGA synthesis tools automatically move registers
into IOBs if the Preset, Clear, and Clock Enable pins are not used.
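
As a minimal sketch of registers that meet these conditions, the following Verilog module describes one flip-flop fed directly by an input port and one flip-flop that directly drives an output port, with no set, reset, or clock enable. The module, port, and signal names are hypothetical and are not taken from a Xilinx library; whether a particular synthesis tool or MAP run actually places these registers in IOBs depends on the tool settings and the target device.

```verilog
// iob_reg_sketch.v (hypothetical example, not from the Xilinx libraries)
module iob_reg_sketch (CLK, D_IN, Q_OUT);

input  CLK;
input  D_IN;
output Q_OUT;

reg IN_REG;   // candidate for an IOB input flip-flop
reg Q_OUT;    // candidate for an IOB output flip-flop

// D_IN feeds the register directly: no logic between the pad and D
always @ (posedge CLK)
    IN_REG <= D_IN;

// Q_OUT drives the output port directly: no logic between Q and the pad
always @ (posedge CLK)
    Q_OUT <= ~IN_REG;

endmodule
```

Adding an asynchronous reset or a clock enable to either always block would make the corresponding register a less likely candidate for the IOB, as described above.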

Use -pr Option with Map

Use the -pr (pack registers) option when running MAP. The
-pr {i | o | b} (input | output | both) option directs the MAP program
to move registers into IOBs under the following circumstances.

1. The input of the register must be connected to an input port, or
   the Q pin must be connected to an output port. For the
   XC4000EX/XL/XV, this applies to non-I/O latches, as well as
   flip-flops.

2. IOBs must have input or output flip-flops. XC5200 devices do not
   have IOB flip-flops.

3. The flip-flop does not use an asynchronous set or reset signal.

4. In XC4000, Spartan, and XC3000 devices, a flop/latch is not
   added to an IOB if it has a BLKNM or LOC conflict with the IOB.

5. In XC4000 or Spartan devices, a flop/latch is not added to an
   IOB if its control signals (clock or clock enable) are not
   compatible with those already defined in the IOB. This occurs when a
   flip-flop (latch) is already in the IOB with different clock or clock
   enable signals, or when the XC4000EX/XL/XV output MUX is
   used in the same IOB.

6. In XC4000EX/XV devices, if a constant 0 or 1 is driven on the
   IOPAD, a flip-flop/latch with a CE is not added to the input side
   of the IOB.

Using Unbonded IOBs (XC4000E/EX/XLA/XL/XV and
Spartan Only)

In some package/device pairs, not all pads are bonded to a package
pin. You can use these unbonded IOBs and the flip-flops inside them
in your design by instantiating them in the HDL code. You can
implement shift registers with these unbonded IOBs. The VHDL and
Verilog examples in this section show how to instantiate unbonded
IOB flip-flops in a 4-bit shift register in an XC4000 device.

Note: The synthesis tool compilers cannot infer unbonded
primitives. Refer to your synthesis tool documentation for a list of library
primitives that can be used for instantiations.

4-bit Shift Register Using Unbonded I/O VHDL
Example

-- UNBONDED_IO.VHD Version 1.0
-- XC4000 LCA has unbonded IOBs which have
-- storage elements that can be used to build
-- shift registers.
-- Below is a 4-bit Shift Register using
-- Unbonded IOB Flip Flops
-- Xilinx HDL Synthesis Design Guide for FPGAs
-- May 1997

Library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity unbonded_io is
    port (A, B:  in  STD_LOGIC;
          CLK:   in  STD_LOGIC;
          Q_OUT: out STD_LOGIC);
end unbonded_io;

architecture XILINX of unbonded_io is

    component IFD_U  -- Unbonded Input FF with INIT=Reset
        port (Q:    out std_logic;
              D, C: in  std_logic);
    end component;

    component IFDI_U -- Unbonded Input FF with INIT=Set
        port (Q:    out std_logic;
              D, C: in  std_logic);
    end component;

    component OFD_U  -- Unbonded Output FF with INIT=Reset
        port (Q:    out std_logic;
              D, C: in  std_logic);
    end component;

    component OFDI_U -- Unbonded Output FF with INIT=Set
        port (Q:    out std_logic;
              D, C: in  std_logic);
    end component;

    -- Internal Signal Declarations

    signal U_Q : STD_LOGIC_VECTOR (3 downto 0);
    signal U_D : STD_LOGIC;

begin

    U_D   <= A and B;
    Q_OUT <= U_Q(0);

    U3: OFD_U  port map (Q => U_Q(3),
                         D => U_D,
                         C => CLK);

    U2: IFDI_U port map (Q => U_Q(2),
                         D => U_Q(3),
                         C => CLK);

    U1: OFDI_U port map (Q => U_Q(1),
                         D => U_Q(2),
                         C => CLK);

    U0: IFD_U  port map (Q => U_Q(0),
                         D => U_Q(1),
                         C => CLK);

end XILINX;


4-bit Shift Register Using Unbonded I/O Verilog
Example

//////////////////////////////////////////////////////
// XC4000 family has unbonded IOBs which have       //
// storage elements that can be used to build       //
// functions like shift registers.                  //
// Below is a 4-bit Shift Register using Unbonded   //
// IOB Flip Flops                                   //
// HDL Synthesis Design Guide for FPGAs             //
// May 1997                                         //
//////////////////////////////////////////////////////

module unbonded_io (A, B, CLK, Q_OUT);

input A, B, CLK;
output Q_OUT;

wire [3:0] U_Q;
wire U_D;

assign U_D = A & B;
assign Q_OUT = U_Q[0];

// Same four instances as in the VHDL example above
OFD_U  U3 (.Q(U_Q[3]), .D(U_D),    .C(CLK));
IFDI_U U2 (.Q(U_Q[2]), .D(U_Q[3]), .C(CLK));
OFDI_U U1 (.Q(U_Q[1]), .D(U_Q[2]), .C(CLK));
IFD_U  U0 (.Q(U_Q[0]), .D(U_Q[1]), .C(CLK));

endmodule

Implementing Multiplexers with Tristate Buffers

A 4-to-1 multiplexer is efficiently implemented in a single XC4000 or
Spartan family CLB. The six input signals (four inputs, two select
lines) use the F, G, and H function generators. Multiplexers that are
larger than 4-to-1 exceed the capacity of one CLB. For example, a
16-to-1 multiplexer requires five CLBs and has two logic levels. These
additional CLBs increase area and delay. Xilinx recommends that you
use internal tristate buffers (BUFTs) to implement large multiplexers.

Large multiplexers built with BUFTs have the following advantages.

• Can vary in width with only minimal impact on area and delay

• Can have as many inputs as there are tristate buffers per
horizontal longline in the target device

• Have one-hot encoded selector inputs

This last point is illustrated in the following VHDL and Verilog
designs of a 5-to-1 multiplexer built with gates. Typically, the gate
version of this multiplexer has binary encoded selector inputs and
requires three select inputs (SEL<2:0>). The schematic representation
of this design is shown in the "5-to-1 MUX Implemented with Gates"
figure.

Some synthesis tools include commands that allow you to switch
between multiplexers with gates or with tristates. Check with your
synthesis vendor for more information.

The VHDL and Verilog designs provided at the end of this section
show a 5-to-1 multiplexer built with tristate buffers. The tristate
buffer version of this multiplexer has one-hot encoded selector inputs
and requires five select inputs (SEL<4:0>). The schematic representation
of these designs is shown in the "5-to-1 MUX Implemented with
BUFTs" figure.

Mux Implemented with Gates VHDL Example

-- MUX_GATE.VHD
-- 5-to-1 Mux Implemented in Gates
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;

entity mux_gate is
    port (SEL:       in  STD_LOGIC_VECTOR (2 downto 0);
          A,B,C,D,E: in  STD_LOGIC;
          SIG:       out STD_LOGIC);
end mux_gate;

architecture RTL of mux_gate is
begin

    SEL_PROCESS: process (SEL, A, B, C, D, E)
    begin
        case SEL is
            when "000"  => SIG <= A;
            when "001"  => SIG <= B;
            when "010"  => SIG <= C;
            when "011"  => SIG <= D;
            when others => SIG <= E;
        end case;
    end process SEL_PROCESS;

end RTL;


Mux Implemented with Gates Verilog Example

/* MUX_GATE.V
 * May 1997 */

module mux_gate (A,B,C,D,E,SEL,SIG);

input A,B,C,D,E;
input [2:0] SEL;
output SIG;
reg SIG;

// E is included in the sensitivity list so that simulation
// matches the synthesized multiplexer
always @ (A or B or C or D or E or SEL)
    case (SEL)
        3'b000:
            SIG=A;
        3'b001:
            SIG=B;
        3'b010:
            SIG=C;
        3'b011:
            SIG=D;
        3'b100:
            SIG=E;
        default: SIG=A;
    endcase

endmodule



Figure 4-9 5-to-1 MUX Implemented with Gates

Mux Implemented with BUFTs VHDL Example

-- MUX_TBUF.VHD
-- 5-to-1 Mux Implemented in 3-State Buffers
-- May 1997

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;

entity mux_tbuf is
    port (SEL:       in  STD_LOGIC_VECTOR (4 downto 0);
          A,B,C,D,E: in  STD_LOGIC;
          SIG:       out STD_LOGIC);
end mux_tbuf;

architecture RTL of mux_tbuf is
begin

    SIG <= A when (SEL(0)='0') else 'Z';
    SIG <= B when (SEL(1)='0') else 'Z';
    SIG <= C when (SEL(2)='0') else 'Z';
    SIG <= D when (SEL(3)='0') else 'Z';
    SIG <= E when (SEL(4)='0') else 'Z';

end RTL;


Mux Implemented with BUFTs Verilog Example

/* MUX_TBUF.V
 * May 1997 */

module mux_tbuf (A,B,C,D,E,SEL,SIG);

input A,B,C,D,E;
input [4:0] SEL;
output SIG;
reg SIG;

always @ (SEL or A)
begin
    if (SEL[0]==1'b0)
        SIG=A;
    else
        SIG=1'bz;
end

always @ (SEL or B)
begin
    if (SEL[1]==1'b0)
        SIG=B;
    else
        SIG=1'bz;
end

always @ (SEL or C)
begin
    if (SEL[2]==1'b0)
        SIG=C;
    else
        SIG=1'bz;
end

always @ (SEL or D)
begin
    if (SEL[3]==1'b0)
        SIG=D;
    else
        SIG=1'bz;
end

always @ (SEL or E)
begin
    if (SEL[4]==1'b0)
        SIG=E;
    else
        SIG=1'bz;
end

endmodule



Figure 4-10 5-to-1 MUX Implemented with BUFTs

A comparison of timing and area for a 5-to-1 multiplexer built with
gates and tristate buffers in an XC4005EPC84-2 device is provided in
the following table. When the multiplexer is implemented with
tristate buffers, no CLBs are used and the delay is smaller.

Table 4-5 Timing/Area for 5-to-1 MUX (XC4005EPC84-2)

Timing/Area    Using BUFTs                Using Gates
Timing         15.31 ns (1 block level)   17.56 ns (2 block levels)
Area           0 CLBs, 5 BUFTs            3 CLBs
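
Because the BUFT version expects one-hot select inputs (active low in the examples above, since a buffer drives when its SEL bit is 0), a design that only has a binary select needs a decoder in front of it. The following is a minimal sketch of such a decoder; the module name sel_onehot_low and the port SEL_BIN are hypothetical, and the decoder itself consumes CLB resources that the comparison table above does not account for.

```verilog
// sel_onehot_low.v (hypothetical helper, not part of the original designs)
// Converts a 3-bit binary select into the 5-bit active-low one-hot
// select expected by the mux_tbuf example.
module sel_onehot_low (SEL_BIN, SEL);

input  [2:0] SEL_BIN;
output [4:0] SEL;
reg    [4:0] SEL;

always @ (SEL_BIN)
    case (SEL_BIN)
        3'b000:  SEL = 5'b11110;  // SEL[0] low: A drives SIG
        3'b001:  SEL = 5'b11101;  // SEL[1] low: B drives SIG
        3'b010:  SEL = 5'b11011;  // SEL[2] low: C drives SIG
        3'b011:  SEL = 5'b10111;  // SEL[3] low: D drives SIG
        3'b100:  SEL = 5'b01111;  // SEL[4] low: E drives SIG
        default: SEL = 5'b11111;  // no buffer enabled: SIG is not driven
    endcase

endmodule
```

In the default case no buffer is enabled, so the shared line is undriven; a real design would need to decide how that condition is handled.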


Using Pipelining 

You can use pipelining to dramatically improve device performance.
Pipelining increases performance by restructuring long data paths
that have several levels of logic and breaking them up over multiple
clock cycles. This method allows a faster clock cycle and, as a result,
an increased data throughput at the expense of added data latency.
Because Xilinx FPGA devices are register-rich, pipelining is usually
advantageous for FPGA designs: the pipeline is
created at no cost in terms of device resources. Because data is now
on a multi-cycle path, special considerations must be used for the rest
of your design to account for the added path latency. You must also
be careful when defining timing specifications for these paths.

Some synthesis tools have limited capability for constraining
multi-cycle paths, or for translating these constraints to Xilinx
implementation constraints. Check your synthesis tool documentation for
information on multi-cycle paths. If your tool cannot translate the
constraint but can synthesize to a multi-cycle path, you can add the
constraint to the UCF file.

Before Pipelining 

In the following example, the clock speed is limited by the
clock-to-out time of the source flip-flop; the logic delay through four
levels of logic; the routing associated with the four function generators;
and the setup time of the destination register.

Figure 4-11 Before Pipelining

After Pipelining

This is an example of the same data path in the previous example
after pipelining. Because the flip-flop is contained in the same CLB as
the function generator, the clock speed is limited by the clock-to-out
time of the source flip-flop; the logic delay through one level of logic;
one routing delay; and the setup time of the destination register. In
this example, the system clock runs much faster than in the previous
example.





Figure 4-12 After Pipelining 
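
The before and after structures can be sketched in Verilog as follows. This is an illustrative example only: the module name, the signal names, and the particular logic function are assumptions, not the circuit in the figures. A function this small would normally fit in a single CLB, so it only stands in for a path with four real levels of logic.

```verilog
// pipe_sketch.v (hypothetical example)
module pipe_sketch (CLK, A, B, C, D, E, Q_FLAT, Q_PIPE);

input  CLK, A, B, C, D, E;
output Q_FLAT, Q_PIPE;

reg Q_FLAT;                   // result without pipelining
reg S1, S2, S3, Q_PIPE;       // pipeline stage registers
reg C1, D1, D2, E1, E2, E3;   // delayed operands, kept aligned with the data

// Before pipelining: four levels of logic between registers
always @ (posedge CLK)
    Q_FLAT <= (((A & B) | C) ^ D) & E;

// After pipelining: one level of logic between registers
always @ (posedge CLK)
begin
    S1 <= A & B;    C1 <= C;   D1 <= D;   E1 <= E;    // stage 1
    S2 <= S1 | C1;             D2 <= D1;  E2 <= E1;   // stage 2
    S3 <= S2 ^ D2;                        E3 <= E2;   // stage 3
    Q_PIPE <= S3 & E3;                                // stage 4
end

endmodule
```

Q_PIPE carries the same values as Q_FLAT, three clock cycles later. That extra latency, and the registers needed to keep C, D, and E aligned with the partial results, are the cost referred to in the text above.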

Design Hierarchy 

HDL designs can either be synthesized as a flat module or as many
small modules. Each methodology has its advantages and
disadvantages, but as higher density FPGAs are created, the advantages of
hierarchical designs outweigh any disadvantages.

Advantages to building hierarchical designs are as follows.

• Easier and faster verification/simulation

• Allows several engineers to work on one design at the same time

• Speeds up design compilation

• Reduces design time by allowing design module re-use for this
and future designs

• Allows you to produce designs that are easier to understand

• Allows you to efficiently manage the design flow

Disadvantages to building hierarchical designs are as follows.

• Design mapping into the FPGA may not be as optimal across
hierarchical boundaries; this can cause lower device utilization
and decreased design performance

• Design file revision control becomes more difficult

• Designs become more verbose

Most of the disadvantages listed above can be overcome with careful
design consideration when choosing the design hierarchy.

Using Synthesis Tools with Hierarchical Designs 

By effectively partitioning your designs, you can significantly reduce
compile time and improve synthesis results. Here are some
recommendations for partitioning your designs.

Restrict Shared Resources to Same Hierarchy Level 

Resources that can be shared should be on the same level of
hierarchy. If these resources are not on the same level of hierarchy, the
synthesis tool cannot determine if these resources should be shared.
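
As a minimal sketch of what "shareable on the same level" can look like, the following hypothetical module keeps two mutually exclusive additions in one always block of one module. Whether a given synthesis tool actually shares one adder between them depends on the tool and its settings; the module and signal names are illustrative.

```verilog
// share_sketch.v (hypothetical example)
module share_sketch (SEL, A, B, C, D, SUM);

input        SEL;
input  [7:0] A, B, C, D;
output [8:0] SUM;
reg    [8:0] SUM;

// Both additions are described in the same module and only one result
// is used at a time, so the tool can consider building a single adder
// with multiplexed operands instead of two adders.
always @ (SEL or A or B or C or D)
    if (SEL)
        SUM = A + B;
    else
        SUM = C + D;

endmodule
```

If the two additions were placed in separate lower-level modules, the tool would see each one in isolation and could not make this trade-off.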

Compile Multiple Instances Together 

You may want to compile multiple occurrences of the same instance 
together to reduce the gate count. However, to increase design speed, 
do not compile a module in a critical path with other instances. 

Restrict Related Combinatorial Logic to Same Hierarchy Level 

Keep related combinatorial logic in the same hierarchical level to
allow the synthesis tool to optimize an entire critical path in a single
operation. Boolean optimization does not operate across hierarchical
boundaries. Therefore, if a critical path is partitioned across
boundaries, logic optimization is restricted. In addition, constraining
modules is difficult if combinatorial logic is not restricted to the same
level of hierarchy.

Separate Speed Critical Paths from Non-critical Paths

To achieve satisfactory synthesis results, locate design modules with
different functions at different levels of the hierarchy. Design speed is
the first priority of optimization algorithms. To achieve a design that
efficiently utilizes device area, remove timing constraints from design
modules.

Restrict Combinatorial Logic that Drives a Register to Same 
Hierarchy Level 

To reduce the number of CLBs used, restrict combinatorial logic that 
drives a register to the same hierarchical block. 

Restrict Module Size 

Restrict module size to 100 - 200 CLBs. This range varies based on
your computer configuration; the time required to complete each
optimization run; if the design is worked on by a design team; and
the target FPGA routing resources. Although smaller blocks give you
more control, you may not always obtain the most efficient design.
For the final compilation of your design, you may want to compile
fully from the top down. Check with your synthesis vendor for
guidelines.

Register All Outputs

Arrange your design hierarchy so that registers drive the module
output in each hierarchical block. Registering outputs makes your
design easier to constrain because you only need to constrain the
clock period and the ClockToSetup of the previous module. If you
have multiple combinatorial blocks at different levels of the
hierarchy, you must manually calculate the delay for each module. Also,
registering the outputs of your design hierarchy can eliminate any
possible problems with logic optimization across hierarchical
boundaries.
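
A minimal sketch of a module with a registered output follows. The module and signal names are hypothetical; the point is only that the combinatorial logic stays inside the block and a flip-flop drives the port.

```verilog
// reg_out_sketch.v (hypothetical example)
module reg_out_sketch (CLK, A, B, C, Y);

input  CLK, A, B, C;
output Y;
reg    Y;

wire Y_COMB;

// Combinatorial logic is kept inside the module
assign Y_COMB = (A & B) | C;

// The module output is driven by a register, so every path that
// crosses this boundary starts at a flip-flop
always @ (posedge CLK)
    Y <= Y_COMB;

endmodule
```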

Restrict One Clock to Each Module or to Entire 
Design 

By restricting one clock to each module, you only need to describe the
relationship between the clock at the top level of the design hierarchy
and each module clock. By restricting one clock to the entire design,
you only need to describe the clock at the top level of the design
hierarchy.

Note: See your synthesis tool documentation for more information
on optimizing logic across hierarchical boundaries and compiling
hierarchical designs.

This chapter describes simulation methods for verifying the function
and timing of your designs. It includes the following sections.

• "Introduction"

• "Functional Simulation"

• "Timing Simulation"

• "Using VHDL/Verilog Libraries and Models"

• "Simulating Global Signals"

• "Adapting Schematic Global Signal Methodology for VHDL"

• "Setting VHDL Global Set/Reset Emulation in Functional
Simulation"

• "Using Oscillators (VHDL)"

• "Compiling Verilog Libraries"

• "Setting Verilog Global Set/Reset"

• "Setting Verilog Global Tristate (XC4000, Spartan, and XC5200
Outputs Only)"

Introduction 

Xilinx supports functional and timing simulation of HDL designs at
the following three points in the HDL design flow, as shown in the
following figure.

• Register Transfer Level (RTL) simulation, which may include the
following.

    • Instantiated UniSim library components

    • LogiBLOX modules

    • LogiCORE models

• Post-synthesis functional simulation with one of the following.

    • Gate-level UniSim library components
    or
    • Gate-level pre-route SimPrim library components

• Post-implementation back-annotated timing simulation with the
following.

    • SimPrim library components

    • Standard Delay Format (SDF) file

The three primary simulation points can be expanded to allow for
two additional post-synthesis simulations, as shown in the following
table. These two additional points can be used when the synthesis
tool either cannot write VHDL or Verilog, or if the netlist is not in
terms of UniSim components.

Table 5-1 Five Simulation Points in HDL Design Flow

Simulation                               UniSim   LogiBLOX Models   SimPrim   SDF
1. RTL                                   X        X                 -         -
2. Post-Synthesis                        X        X                 -         -
3. Functional Post-NGDBuild (Optional)   -        -                 X         -
4. Functional Post-MAP (Optional)        -        -                 X         X
5. Post-Route Timing                     -        -                 X         X

These simulation points are described in detail in the "Functional
Simulation" section and the "Timing Simulation" section. The
libraries required to support the simulation flows are described in
detail in the "Using VHDL/Verilog Libraries and Models" section.
The new flows and libraries now support closer functional equivalence
of initialization behavior between functional and timing simulations.
This is due to the addition of new methodologies and library
cells to simulate GSR/GTS behavior.

It is important to address the built-in reset circuitry behavior in your
designs starting with the first simulation to ensure that the simulations
agree at the three primary points.

If you do not simulate GSR behavior prior to synthesis and place and
route, your RTL and possibly post-synthesis simulations will not
initialize to the same state as your post-route timing simulation. As a
result, your various design descriptions are not functionally
equivalent and your simulation results will not match. In addition to
the behavioral representation for GSR, you need to add a Xilinx
implementation directive. This directive tells the place and route
tools to use the special-purpose GSR net that is pre-routed on the
chip, and not to use the local asynchronous set/reset pins. Some
synthesis tools can identify the GSR net from the behavioral
description, and will place the STARTUP module on the net to direct
the implementation tools to use the global network. However, other
synthesis tools interpret behavioral descriptions literally, and will
introduce additional logic into your design to implement a function.
Without specific instructions to use device global networks, the Xilinx
implementation tools will use general-purpose logic and interconnect
resources to redundantly build functions already provided by the
silicon.

Even if GSR behavior is not described, the actual chip initializes
during configuration, and the post-route netlist will have this net that
must be driven during simulation. The "Simulating Global Signals"
section includes the methodology to describe this behavior, as well as
the GTS behavior for output buffers.
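
As a rough illustration of describing GSR behavior at the RTL level, the
following VHDL sketch models a register with an asynchronous reset. The
entity name gsr_reg_sketch and the port name reset are hypothetical,
and the reset value of '0' is an assumption. The Xilinx implementation
directive that moves the net onto the dedicated GSR resource depends on
the tool flow and is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; not part of any Xilinx library.
entity gsr_reg_sketch is
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- stands in for the global set/reset net
    d     : in  std_logic;
    q     : out std_logic
  );
end gsr_reg_sketch;

architecture rtl of gsr_reg_sketch is
begin
  -- Asynchronous reset: q is forced to '0' whenever reset is high,
  -- regardless of the clock. This is the initialization behavior
  -- that the RTL simulation needs to share with later simulations.
  process (clk, reset)
  begin
    if reset = '1' then
      q <= '0';
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end rtl;
```

Pulsing reset at time zero in the test bench gives the register a known
starting state, which is the state the post-route netlist is expected
to reach after its global reset pulse.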

Xilinx VHDL simulation supports the VITAL standard. This standard
allows you to simulate with any VITAL-compliant simulator,
including MTI/Mentor ModelSim, Synopsys VSS, and Active-VHDL.

Built-in Verilog support allows you to simulate with the Cadence 
Verilog-XL and other compatible simulators. Xilinx HDL simulation 
supports all current Xilinx FPGA and CPLD devices. Refer to the
"Using VHDL/Verilog Libraries and Models" section for the list of 
supported VHDL and Verilog standards. 

Functional Simulation 

Functional simulation of HDL designs includes support for FPGA 
and CPLD architecture features, such as global set/reset, global 
tristate enable, on-chip oscillators, RAMs, and ROMs. 

You can perform functional simulation at the following points in the 
design flow. 

• RTL simulation, including instantiated UniSim library
components, LogiBLOX modules, and LogiCORE models

• Post-synthesis gate-level VHDL or Verilog netlist

• Post-NGDBuild gate-level VHDL or Verilog netlist from
implementation after NGDBuild using the SimPrim library

• Post-map partial-timing functional simulation with netlist and
SDF file from implementation after mapping, and before place
and route, using the SimPrim library

Register Transfer Level (RTL)

The first simulation point is the RTL-level simulation that allows you
to verify or simulate a description at the system or chip level. At this 
level, the system or chip is described with high-level RTL language 
constructs. VHDL and Verilog simulators exercise the design to check 
its functionality before it is implemented in gates. A test bench is 
created to model the environment of the system or chip. At this level, 
the Unified Simulation Library (UniSim) is used to instantiate cells. 
You can also instantiate LogiBLOX components if you do not want to 
rely on the module generation capabilities of your synthesis tool, or if 
your design requires larger memory structures. 

Post-Synthesis (Pre-NGDBuild) Gate-Level 
Simulation 

After the RTL simulation is error-free, the system or chip is
synthesized to gates. The test bench is used again to simulate the
synthesized result and check its consistency with the original design
description.

Gate-level simulation in the Xilinx design flow includes any
simulation performed after any of the synthesis, map, or place and
route stages. The post-synthesis pre-NGDBuild gate-level simulation is
a functional simulation with unit delay timing. The gates are expressed
in terms of UniSim components.

This gate-level functional simulation allows you to directly verify
your design after it has been generated by the synthesis tool. If there
are differences in the behavior of the original RTL description and the
synthesized design, this may indicate a problem with the synthesis
tool. However, RTL-level and gate-level simulation may differ even
when the synthesis is correct, because of the change in the level of
abstraction. In general, overall functionality should be the same, but
timing differences may occur.

Post-synthesis simulation is synthesis vendor-dependent, and the
synthesis tool must write VHDL or Verilog netlists in terms of
UniSim library components. Check with your synthesis vendor for
this feature. The library usage guidelines for RTL simulation also
apply to post-synthesis pre-NGDBuild gate-level functional
simulation. LogiBLOX models remain as behavioral blocks and can be
simulated in the same design as structural UniSim components. You may
have to insert library statements into the HDL code.

Post-NGDBuild (Pre-Map) Gate-Level Simulation 

The post-NGDBuild (pre-map) gate-level functional simulation is
used when it is not possible to simulate the direct output of the
synthesis tool. This occurs when the tool cannot write
UniSim-compatible VHDL or Verilog netlists. In this case, NGDBuild
translates the EDIF or XNF output of synthesis to SimPrim library
components. Like post-synthesis, pre-NGDBuild simulation, this
simulation allows you to verify that your design has been synthesized
correctly, and you can begin to identify any differences due to the
lower level of abstraction. Unlike the post-synthesis pre-NGDBuild
simulation, there are GSR, GR (global reset), PRLD (preload), and GTS
nets that must be initialized, just as for post-map partial timing
simulation.

Different simulation libraries are used to support simulation before 
and after running NGDBuild. Prior to NGDBuild, your design is
expressed as a UniSim netlist containing Unified library components.
After NGDBuild, your design is a netlist containing SimPrims.
Although these library changes are fairly transparent, two important 
considerations are specifying different simulation libraries for pre- 
and post-implementation simulation, and the different gate-level 
cells in pre- and post-implementation netlists. 

You can pause the implementation tools after reading in your source 
files; select either a VHDL or Verilog netlist as an output file; and 
write the file. 

Post-Map Partial Timing (CLB and IOB Block Delays) 

The third type of post-synthesis functional simulation is post-map
partial timing simulation. This gate-level description is also
expressed in terms of SimPrim components.

The SDF file that is created contains timing numbers for the CLB and
IOB mapped blocks. At this point, your design is not routed and does
not contain net delays, except for pre-laid out macros, such as cores.
Like the post-NGDBuild and full-timing simulation, there are GSR,
GR, PRLD, and GTS nets that must be initialized. If you want partial
timing delays, you can use the SDF file (this is optional).

You can pause the implementation tools after the Map program stops,
and write one of the HDL formats.

Creating a Simulation Test Bench 

When you create a test bench for functional simulation, refer to the 
following table for coding style examples. You should refer to your 
synthesis vendor's documentation before using these styles because 
many of the styles cannot be synthesized. 

Table 5-2 Common Coding Examples for Verilog/VHDL

Delay or wait 20 ns

    Verilog:  `timescale 1 ns / 100 ps
              #20
    VHDL:     wait for 20 ns;

Creation of a free running clock

    Verilog:  initial
              begin
                clock = 0;
                #25 forever #25 clock = ~clock;
              end
    VHDL:     loop
                wait for 25 ns;
                clock <= not (clock);
              end loop;

Print "Text." to screen

    Verilog:  $display("Text");
    VHDL:     report "Text.";

Print value of signal to screen whenever the value changes

    Verilog:  $monitor("%r", $realtime, "%b", clock, "%b", my_signal);
    VHDL:     (no example given)

Apply a binary value 1010 to an input bus X

    Verilog:  (no example given)
    VHDL:     X <= "1010";

Creation of a for loop (x from 0 to 9)

    Verilog:  for (x = 0; x < 10; x = x + 1)
              begin
                actions
              end
    VHDL:     for x in 0 to 9 loop
                actions
              end loop;

Write "X = value" to an output file

    Verilog:  $dumpfile("filename.dmp");
              $dumpvars(X);
    VHDL:     variable TEMP : line;
              write (TEMP, string'("X ="));
              write (TEMP, X);
              writeline (filename, TEMP);

Wait until X is logic one

    Verilog:  (no example given)
    VHDL:     wait until X = '1';

Wait until X transitions to a logic one

    Verilog:  @(posedge X);
    VHDL:     wait on X;

If-Else construct

    Verilog:  always @(X)
              begin
                if (X == 1)
                  Y = 1'b0;
                else
                  Y = 1'b1;
              end
    VHDL:     process (X)
              begin
                if (X = '1') then
                  Y <= '0';
                else
                  Y <= '1';
                end if;
              end process;

Case construct

    Verilog:  always @(X or A)
                case (X)
                  2'b00   : Y = 1'b0;
                  2'b01   : Y = 1'b1;
                  default : Y = A;
                endcase
    VHDL:     process (X, A)
              begin
                case X is
                  when "00"   => Y <= '0';
                  when "01"   => Y <= '1';
                  when others => Y <= A;
                end case;
              end process;

Example instantiation of an OFD

    Verilog:  OFD U1 (.Q(D_OUT), .D(D_IN), .C(CLOCK));
    VHDL:     U1: OFD port map (Q => D_OUT, D => D_IN, C => CLOCK);

Timing Simulation 

Timing simulation is important in verifying the operation of your
circuit after the worst-case placed and routed delays are calculated
for your design. In many cases, you can use the same test bench that
you used for functional simulation to perform a more accurate
simulation with less effort. You can compare the results from the two
simulations to verify that your design is performing as initially
specified. The Xilinx tools create a VHDL or Verilog simulation netlist
of your placed and routed design, and provide libraries that work with
many common HDL simulators.

Post-Route Full Timing (Block and Net Delays)

After your design is routed using PAR, it can be simulated with the
actual block and net timing delays with the same test bench used in 
the behavioral simulation. 

The back-annotation process (NGDAnno) produces a netlist of
SimPrims annotated with an SDF file with the appropriate block and
net delay data from the place and route process. This netlist has GSR,
GR, PRLD, and GTS nets that must be initialized. For more
information, refer to the "Simulating Global Signals" section.

Creating a Timing Simulation Netlist 

You can create a timing simulation netlist from the Design Manager 
or from the command line, as described in this section. 

From the Design Manager 

1. Select Setup -> Options in the Flow Engine. 

The Options dialog box appears.

2. Select the Produce Timing Simulation Data button in the 
Optional Targets field. 

3. Select the Edit Template button next to the Implementation
drop-down list in the Program Options Templates field.

The Implementation Template dialog box appears.

4. Select the Interface tab. 

5. In the Simulation Data Options field, select applicable options as 
follows. 

• Format

Specify the netlist format to use for simulation. The format is 
usually VHDL or Verilog for synthesis designs. 

• Correlate Simulation Data to Input Design 

This option enables signal back annotation to the original
compiled netlist. By default, this option is on. However, you
can turn it off to decrease run time, or if there are problems
with the back-annotated simulation data. Port names are not
changed by this option; however, internal node names may
change.

• Bring Out Global Set/Reset Net as a Port 

This option creates an external port in the simulation netlist
to allow control of the power-on reset from a port. This 
option is not necessary for most simulators, and is off by 
default. 

6. Click OK in the Implementation Template dialog box. 

7. Click OK in the Options dialog box. 

8. When you implement your design, the Flow Engine produces 
timing simulation data files. 

Note: If you are using the Verilog-XL simulator, you may want to use 
the -ul switch for the NGD2VER program to automatically add the 
uselib directive to the simulation netlist to point to the location of the 
simulation libraries. Use the Template Manager to set this switch. 
Refer to the Design Manager/Flow Engine Guide or
http://www.xilinx.com/techdocs/3167.htm for more information.

From the Command Line 

Note: To display the available options for the programs in this 
section, enter the program executable name at the prompt without 
any arguments. For complete descriptions of these options, refer to 
the Development System Reference Guide. 

1. Run NGDAnno on your placed and routed .ncd file.

For back-annotated output (signals correlated to original netlist),
enter the following.

ngdanno -p design.pcf design.ncd design.ngm

For output that is not back-annotated (faster run time), enter the
following.

ngdanno design.ncd

2. Run the NGD2XXX program for the particular netlist you want
to create.

For VHDL, enter the following.

ngd2vhdl [options] design.nga

For Verilog, enter the following.

ngd2ver [options] design.nga

Using VHDL/Verilog Libraries and Models 

The five simulation points listed previously require the UniSim,
SimPrim, XDW (Xilinx DesignWare), and LogiBLOX libraries. The
first point, RTL simulation, is a behavioral description of your design
at the register transfer level. RTL simulation is not
architecture-specific unless your design contains instantiated UniSim
or LogiBLOX components. To support these instantiations, Xilinx
provides a functional UniSim library and a behavioral LogiBLOX library.

The second point, post-synthesis (pre-NGDBuild) gate-level
simulation, uses the UniSim and XDW libraries. The third, fourth, and
fifth points (post-NGDBuild, post-map, and post-route) use the SimPrim
library. The following table indicates what library is required for each
of the five simulation points.

Table 5-3 Simulation Phase Library Information

    Simulation Point    Compilation Order of Library Required
    -----------------------------------------------------------
    RTL                 UniSim
                        LogiBLOX
    Post-Synthesis      UniSim (Device dependent)
    Post-NGDBuild       SimPrim
    Post-MAP            SimPrim
    Post-Route          SimPrim


Adhering to Industry Standards 


The standards in the following table are supported by the Xilinx 
simulation flow. 

Table 5-4 Standards Supported by Xilinx Simulation Flow

    Description                    Version
    ---------------------------------------------------
    VHDL Language                  IEEE-STD-1076-87
    Verilog Language               IEEE-STD-1364-95
    VITAL Modeling Standard        IEEE-STD-1076.4-95
    Standard Delay Format (SDF)    2.1
    Std_logic Data Type            IEEE-STD-1164-93


The UniSim and SimPrim libraries adhere to IEEE-STDs. The VHDL
library uses the VITAL IEEE-STD-1076.4 standard, and the Verilog
library uses the IEEE-STD-1364 standard. By following these
standards, Xilinx provides support for customers who use HDL tools
from various vendors.

VHDL Initiative Towards ASIC Libraries (VITAL) was created to 
promote the standardization and interchangeability of VHDL 
libraries and simulators from various vendors. It also defines a
standard for timing back-annotation to VHDL simulators.

Most simulator vendors have agreed to use the IEEE-STD-1076.4
VITAL standard for the acceleration of gate-level simulations. Check
with your simulator vendor to confirm that this standard is being
followed, and to verify proper settings and VHDL packages for this
standard. The simulator may also accelerate IEEE-STD-1164, the
standard logic package for types.

VITAL libraries include some overhead for timing checks and
back-annotation styles. The UniSim Library turns these timing checks
off for unit delay functional simulation. The SimPrim back-annotation
library keeps these checks on by default; however, you or your
system administrator can turn them off. You must edit and
recompile the SimPrim components file after setting the generics.

Locating Library Source Files

The following table provides information on the location of the
simulation library source files, as well as the order for a typical
compilation.

Table 5-5 Simulation Library Source Files

UniSim, 4K Family and Spartan (use UNI4000E for Spartan)

    Location of source files
        Verilog:     $XILINX/verilog/src/unisims
        VITAL VHDL:  $XILINX/vhdl/src/unisims
    Required libraries
        Verilog:     Not required for Verilog-XL; see vendor
                     documentation for other simulators
        VITAL VHDL:  Required; typical compilation order:
                     unisim_VCOMP.vhd
                     unisim_VPKG.vhd
                     unisim_VITAL.vhd
                     unisim_VCFG4K.vhd (optional)

UniSim, 52K Family

    Location of source files
        Verilog:     $XILINX/verilog/src/uni5200
        VITAL VHDL:  $XILINX/vhdl/src/unisims
    Required libraries
        Verilog:     Not required for Verilog-XL; see vendor
                     documentation for other simulators
        VITAL VHDL:  Required; typical compilation order:
                     unisim_VCOMP52K.vhd
                     unisim_VITAL.vhd
                     unisim_VITAL52K.vhd
                     unisim_VCFG52K.vhd

LogiBLOX (Device Independent)

    Location of source files
        Verilog:     None; uses SimPrim library
        VITAL VHDL:  $XILINX/vhdl/src/logiblox
    Required libraries
        Verilog:     None; uses SimPrim library
        VITAL VHDL:  Required; typical compilation order:
                     mvlutil.vhd
                     mvlarith.vhd
                     logiblox.vhd

SimPrim (Device Independent)

    Location of source files
        Verilog:     $XILINX/verilog/src/simprims
        VITAL VHDL:  $XILINX/vhdl/src/simprims
    Required libraries
        Verilog:     Not required for Verilog-XL; see vendor
                     documentation for other simulators
        VITAL VHDL:  Required; typical compilation order:
                     simprim_Vcomponents.vhd
                     simprim_Vpackage.vhd
                     simprim_VITAL.vhd

Using the UniSim Library 

The UniSim Library, used for functional simulation only, contains
default unit delays. This library includes all the Xilinx Unified
Library components that are inferred by most popular synthesis
tools. In addition, the UniSim Library includes components that are
commonly instantiated, such as I/Os and memory cells. You should
use your synthesis tool's module generators (such as LogiBLOX) to
generate higher order functions, such as adders, counters, and large
RAMs.

UniSim Library Structure 

The UniSim library directory structure is different for VHDL and
Verilog. There is only one VHDL library for all Xilinx technologies
because the implementation differences between architectures are not
important for unit delay functional simulation except in a few cases
where functional differences occur.

For example, the decoder in XC4000 devices has a pull-up, and in
XC5200 devices, it does not. In these few cases, configuration
statements are used to choose between architectures for the
components. One library makes it easy to switch between technologies.
It is left up to the user and the synthesis tool to use
technology-appropriate cells. For Verilog, separate libraries are
provided for each technology because Verilog does not have a
configuration statement.
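
To illustrate how a VHDL configuration statement can choose between
architectures of one component, the following sketch defines a
hypothetical decoder with two architectures, one with a pull-up and
one without, and binds an instance to one of them. The names
decode_sketch, top_sketch, and cfg_pullup are invented for this
example, and the decoder logic is an assumption; these are not the
UniSim models or the Xilinx configuration files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical decoder; the logic is illustrative only.
entity decode_sketch is
  port (a : in  std_logic_vector(1 downto 0);
        o : out std_logic);
end decode_sketch;

-- Variant with a pull-up: weak high ('H') when not driven low.
architecture with_pullup of decode_sketch is
begin
  o <= '0' when a = "11" else 'H';
end with_pullup;

-- Variant without a pull-up: high impedance ('Z') when not driven low.
architecture no_pullup of decode_sketch is
begin
  o <= '0' when a = "11" else 'Z';
end no_pullup;

library ieee;
use ieee.std_logic_1164.all;

entity top_sketch is
  port (a : in  std_logic_vector(1 downto 0);
        o : out std_logic);
end top_sketch;

architecture struct of top_sketch is
  component decode_sketch
    port (a : in  std_logic_vector(1 downto 0);
          o : out std_logic);
  end component;
begin
  u1 : decode_sketch port map (a => a, o => o);
end struct;

-- The configuration selects which architecture instance u1 uses.
configuration cfg_pullup of top_sketch is
  for struct
    for u1 : decode_sketch
      use entity work.decode_sketch(with_pullup);
    end for;
  end for;
end cfg_pullup;
```

A second configuration that binds u1 to no_pullup would switch the
variant without any change to top_sketch, which is the convenience a
single VHDL library relies on.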

Schematic macros are not provided because most schematic vendors
provide the lower-level netlist for importing into a synthesis tool.
Some synthesis vendors have these macros in their libraries, and can
expand them to gates. You can use the HDL output from synthesis to
simulate these macros. You can also use a post-NGDBuild or
post-Map netlist to simulate netlists with embedded schematic macros.
This lower-level netlist for a schematic macro is also required for
implementation. The VHDL models for Synopsys DesignWare
components are in the Xilinx Synopsys Interface at
$XILINX/synopsys/libraries/sim/src/xdw. Because Verilog versions of
the DesignWare components are not currently available, use
post-NGDBuild functional simulation instead.

• VHDL UniSim Library 

The VHDL version of the UniSim library is VITAL-compliant,
and can be accelerated; however, it is not back-annotated with an
SDF file. The files are in $XILINX/vhdl/src/unisims. The source
file should be compiled into a library named UNISIM.

• Verilog UniSim Library

The Verilog version of the UniSim library may not have to be
compiled, depending on the Verilog tool. Because there are a few
cells with functional differences between Xilinx devices, a separate
library is provided for each supported device. For example,
decoders contain pull-ups in some devices and not in others. The
libraries are in uppercase only. The libraries are located at
$XILINX/verilog/src/technology, where technology is UNI3000,
unisims, UNI5200, or UNI9000.

Note: Verilog reserves the names buf, pullup, and pulldown; the 
Xilinx versions are changed to buff, pullup1, pullup2, or pulldown2,
and then mapped to the proper cell during implementation. 

Compiling the UniSim Library 

The UniSim VHDL library (or Verilog library) can be compiled to any
physical location. The VHDL source files are found in
$XILINX/vhdl/src/unisims and are listed here in the order in which
they must be compiled.

1. unisim_VCOMP.vhd (component declaration file)

2. unisim_VCOMP52K.vhd (substitutional component declaration
file for XC5200 designs)

3. unisim_VPKG.vhd (package file)

4. unisim_VITAL.vhd (model file)

5. unisim_VITAL52K.vhd (additional model file for XC5200
designs)

6. unisim_VCFG4K.vhd (configuration file for XC4K edge
decoders)

7. unisim_VCFG52K.vhd (configuration file for XC5200 internal
decoders)

Note: To use both 4K and 52K, compile them into separate directories
as a UniSim library. Change the mapping of the UniSim logical name 
to the appropriate directory for each design. 

The uppercase Verilog components are found in individual component
files in the following directories.

1. $XILINX/verilog/src/uni3000 (Series 3K)

2. $XILINX/verilog/src/unisims (Series 4KE, 4KX, 4KL, 4KXV,
Spartan, Virtex)

3. $XILINX/verilog/src/uni5200 (Series 5200)

4. $XILINX/verilog/src/uni9000 (Series 9500)

Instantiating UniSim Library Components 

You can instantiate UniSim library components in your design and
simulate them during RTL simulation. Your VHDL code must refer to
the UniSim library compiled by you or by your system administrator.
The VHDL simulation tool must map the logical library to the
physical location of the compiled library. VHDL component declarations
are provided in the library and do not need to be repeated in your
code. Verilog must also map to the UniSim Verilog library.

Using the LogiBLOX Library 

LogiBLOX is a module generator used for schematic-based design 
entry of modules such as adders, counters, and large memory blocks. 
In the HDL flow, you can use LogiBLOX to generate large blocks of 
memory for instantiation. Refer to the LogiBLOX Guide for more 
information. 

In addition to the RTL code that results in synthesized logic, you can
generate modules such as counters, adders, and large memory arrays
with LogiBLOX. You can enter the desired parameters into LogiBLOX
and select a VHDL model as output. The VHDL model is at the
behavioral level because it allows for quicker simulation times.
LogiBLOX is primarily useful for building large memory arrays that
cannot be inferred. VHDL models are provided for LogiBLOX
modules from the schematic environment, or for large memory
arrays. Most LogiBLOX modules contain registers and require the
global set/reset (GSR) initialization. Since the modules do not contain
output buffers going off-chip, the global tristate enable (GTS)
initialization does not apply.

LogiBLOX models begin as behavioral in VHDL, but are mapped to
SimPrim structural models in the back-annotated netlist. The
behavioral model is also used for any post-synthesis simulation because
the module is processed as a "black box" during synthesis. It is
important that the initialization behavior is consistent for the
behavioral model used for RTL and post-synthesis simulation and for the
structural model used after implementation. In addition, the
initialization behavior must work with the method used for synthesized
logic and cores.

Note: For Verilog, the LogiBLOX model is a structural netlist of
SimPrim models. Do not synthesize this netlist; it is for functional
simulation only.

Compiling the LogiBLOX Library 

The LogiBLOX library is not a library of modules. It is a set of
packages required by the LogiBLOX models that are created "on-the-fly"
by the LogiBLOX tool.

You can compile the LogiBLOX VHDL library (or Verilog) to any
specified physical location. The VHDL source files are in
$XILINX/vhdl/src/logiblox, and are listed below in the order in which
they must be compiled.

1. mvlutil.vhd

2. mvlarith.vhd

3. logiblox.vhd

The Verilog source files are in $XILINX/verilog/src/logiblox.

Instantiating LogiBLOX Modules 

LogiBLOX components are simulated with behavioral code. They are
not intended to be synthesized, but they can be simulated. The
synthesizer processes the components as a "black box".
Implementation uses the NGO file created by LogiBLOX. The source
libraries for LogiBLOX packages are in $XILINX/vhdl/src/logiblox and
$XILINX/verilog/src/logiblox. The actual models are output from
the LogiBLOX tool. The package files must be compiled into a library
named logiblox. The component model from the LogiBLOX GUI
should be compiled into your working directory with your design.

Using the LogiCORE Library 

In addition to synthesized or generated logic, you can use high-level
pre-designed LogiCORE models. These models are high-level VHDL
behavioral or RTL models that are mapped to SimPrim structural
models in the back-annotated netlist. The behavioral model is used
for any post-synthesis simulation because synthesis processes the
core as a "black box". As with LogiBLOX models, the initialization
behavior must be consistent for the behavioral model used for RTL
and post-synthesis simulation and for the structural model used after
implementation. In addition, the initialization behavior must work
with the method used for synthesized logic and LogiBLOX modules.

The UniSim VHDL and Verilog libraries can emulate the global set/
reset and global tristate network in Xilinx FPGAs. VHDL uses special 
components for driving the local reset and tristate enable lines, and 
sending implementation directives to move the nets to the global 
signal resources. Verilog uses a macro definition. 

The local signals emulate the fully routed global signals in a
post-routed netlist. Both the VHDL and Verilog post-route netlists use the
SimPrim Library and have global reset and output tristate enable 
signals fully routed; they are not emulated. 

LogiBLOX and LogiCORE models are at a behavioral level and do
not use library components for global signals. However, the
LogiBLOX model does require the packages that are compiled into the
LogiBLOX library.

Simulating Global Signals 

Xilinx PLDs have register (flip-flops and latches) set/reset circuitry
that pulses at the end of the configuration mode and after power-up.
This pulse is automatic and does not need to be programmed. All the
flip-flops and latches in a PLD receive this pulse through a dedicated
global set/reset (GSR), PRLD, or reset (GR) net. The registers either
set or reset, depending on how the registers are defined.

In addition to the set/reset pulse, all output buffers are tristated
during configuration mode and after power-up with the dedicated
global output tristate enable (GTS) net. The global tristate and reset
net names are provided in the following table.

Table 5-6 Global Reset and Tristate Names for Xilinx Devices

    Device     Global Reset   Global Tristate   Default Reset
    Family     Name           Name              Polarity
    ----------------------------------------------------------
    XC3000     GR             Not Available     Low
    XC4000     GSR            GTS               High
    XC5200     GR             GTS               High
    XC9500     PRLD           GTS               High
    SPARTAN    GSR            GTS               High


These PRLD, GSR, and GR nets require special handling during
synthesis, simulation, and implementation to prevent them from
being assigned to normally routed nets, which would use valuable
routing resources and degrade design performance. The GSR, PRLD, or GR
net receives a reset-on-configuration pulse from the initialization
controller, as shown in the following figure.



Figure 5-2 Built-in FPGA Initialization Circuitry 

This pulse occurs during the configuration or power-up mode of the
PLD. However, for ease of simulation, it is usually inserted at time
zero of the test bench, before logical simulation is initiated. The pulse
width is device-dependent and can vary widely, depending on
process, voltage, and temperature changes. The pulse is guaranteed to
be long enough to overcome all net delays on the reset
special-purpose net. The parameter for the pulse width is TPOR, as
described in The Programmable Logic Data Book.

The tristate-on-configuration circuit shown in the "Built-in FPGA
Initialization Circuitry" figure also operates during the configuration
or power-up mode of the PLD. Just as for the reset-on-configuration
simulation, the pulse is usually inserted at time zero of the test bench
before logical simulation is initiated. The pulse drives all outputs to
the tristate condition they are in during the configuration of the PLD.
All general-purpose outputs are affected whether they are regular,
tristate, or bidirectional outputs during normal operation. This ensures
that the outputs do not erroneously drive other devices as the PLD is
being configured. The pulse width is device-dependent and can vary
widely with process and temperature changes. The pulse is guaranteed
to be long enough to overcome all net delays on the GTS net. The
generating circuitry is separate from the reset-on-configuration
circuit. The pulse width parameter is TPOR, as described in The
Programmable Logic Data Book. Simulation models use this pulse width
parameter to determine the HDL simulation behavior of the global reset
and tristate circuitry (initially developed for schematic design).

Adapting Schematic Global Signal Methodology for 
VHDL 

There are no global set/reset or output tristate enable pins on the
simulation, synthesis, or implementation models of the register
components in schematic-based designs. During synthesis, both the
global and local reset and tristate enable signals are connected
to the local pin. Schematic simulators can simulate global signals 
without a pin. The global signals are represented as internal signals in 
the schematic simulation model and the test vectors drive the internal 
global signals directly. If you want complete control of initialization, 
use registers with asynchronous set/reset to emulate the GSR, even if 
local set/reset is not required. Synchronous set/reset registers will 
initialize on their own at time zero. They can be synchronously set 
after that but cannot emulate GSR behavior after time zero. Some 
memory components without asynchronous clears will exhibit 
similar behavior. 

In VHDL designs, you must declare as ports any signals that are
stimulated or monitored from outside a module. Global GSR and GTS
signals are used to initialize the simulation and require access ports if
controlled from the test bench. However, the addition of these ports
makes the pre- and post-implementation versions of your design
different, and your original test bench is no longer applicable to both
versions of your design. Since the port lists for the two versions of
your design are different, the socket in the test bench matches only one
of them. To address this issue, five new cells are provided for VHDL
simulation: ROC, ROCBUF, TOC, TOCBUF, and STARTBUF.

Verilog can simulate internal signals, and these signals are driven
directly from the test bench. However, interpretive Verilog (such as
Verilog-XL) and compiled Verilog (such as MTI or NC-Verilog)
require a different approach for handling the libraries.

You do not need to incorporate any ports into schematic designs for 
simulators to mimic the device's global reset (GSR) or global tristate 
(GTS) networks. Schematic simulators specify these signals on the 
register model as ‘global’ to indicate to the simulator that these 
signals are all connected. These signals are not part of the cell's pin 
list, do not appear in the netlist, and are not implemented in the 
resulting design. These global signals are mapped into the equivalent 
signals in the back-end simulation model. Using this methodology 
with schematic designs, you can fully simulate the silicon's built-in 
global networks and implement your design without causing 
congestion of the general-purpose routing resources and degrading the 
clock speed. 

Setting VHDL Global Set/Reset Emulation in 
Functional Simulation 

When using the VHDL UniSim library, it is important to control the 
global signals for reset and output tristate enable. If you do not control 
these signals, your timing simulation results will not match your 
functional simulation results because the initialization differs. 

VHDL simulation does not support test bench driven internal global 
signals. If the test bench drives the global signal, a port is required. 
Otherwise, the global net must be driven by a component within the 
architecture. 

Also, the register components do not have pins for the global signals 
because you do not want to wire to these special pre-laid nets. 
Instead, you want implementation to use the dedicated network on 
the chip. 

For the HDL synthesis flow, the global reset and output tristate 
enable signals are emulated with the local reset and tristate enable 
signals. Special implementation directives are put on the nets to move 
them to the special pre-routed nets for global signals. 

The VHDL UniSim library uses special components to drive the local 
reset and tristate enable signals. These components use the local 
signal connections to emulate the global signal, and also provide the 
implementation directives to ensure that the pre-routed wires are 
used. 

You can instantiate these special components in the RTL description 
to ensure that all functional simulations match the timing simulation 
with respect to global signal initializations. 

Global Signal Considerations (VHDL) 

The following are important to VHDL simulation, synthesis, and 
implementation of global signals in FPGAs. 

• The global signals have automatically generated pulses that 
always occur even if the behavior is not described in the front-end 
description. The back-annotated netlist has these global 
signals, to match the silicon, even if the source design does not. 

• The simulation and synthesis models for registers (flip-flops and 
latches) and output buffers do not contain pins for the global 
signals. This is necessary to maintain compatibility with schematic 
libraries that do not require the pin to model the global 
signal behavior. 

• VHDL does not have a standardized method for handling global 
signals that is acceptable within a VITAL-compatible library. 

• LogiBLOX generates modules that are represented as behavioral 
models and require a different way to handle the global signal, 
yet still maintain compatibility with the method used for general 
user-defined logic and LogiCOREs. 

• Intellectual property cores from the LogiCORE product line are 
represented as behavioral, RTL, or structural models and require 
a different way to handle the global signal, yet still maintain 
compatibility with the method used for general user-defined 
logic and LogiBLOX. 

• The design is represented at different levels of abstraction during 
the pre- and post-synthesis and implementation phases of the 
design process. The solutions work for all three levels and give 
consistent results. 

• The place and route tools must be given special directives to 
identify the global signals in order to use the built-in circuitry 
instead of the general-purpose logic. 

GSR Network Design Cases 

When defining a methodology to control a device's global set/reset 
(GSR) network, you should consider the following three general 
cases. 

Table 5-7 GSR Design Cases 

Name      Description 
--------  ------------------------------------------------------- 
Case 1    Reset-On-Configuration pulse only; no user control of 
          GSR 
Case 1A   Simulation model ROC initializes sequential elements 
Case 1B   User initializes sequential elements with ROCBUF 
          model and simulation vectors 
Case 2    User control of GSR after Power-on-Reset 
Case 2A   External port driving GSR 
Case 2B   Internal signal driving GSR 
Case 3    Don't Care 


Note: Reset-on-Configuration for PLDs is similar to Power-on-Reset 
for ASICs except it occurs at power-up and during configuration of 
the PLD. 

Case 1 is defined as follows. 

• Automatic pulse generation of the Reset-On-Configuration signal 

• No control of GSR through a test bench 

• Involves initialization of the sequential elements in a design 
during power-on, or initialization during configuration of the 
device 

• Need to define the initial states of a design's sequential elements, 
and have these states reflected in the implemented and simulated 
design 

• Two sub-cases 

• In Case 1A, you do not provide the simulation with an 
initialization pulse. The simulation model provides its own 
mechanism for initializing its sequential elements (such as the real 
device does when power is first applied). 

• In Case 1B, you can control the initializing power-on reset 
pulse from a test bench without a global reset pin on the 
FPGA. This case is applicable when system-level issues make 
your design's initialization synchronous to an off-chip event. 
In this case, you provide a pulse that initializes your design 
at the start of simulation time, and possibly provide further 
pulses as simulation time progresses (perhaps to simulate 
cycling power to the device). Although you are providing the 
reset pulse to the simulation model, this pulse is not required 
for the implemented device. A reset port is not required on 
the implemented device; however, a reset port is required in 
the behavioral code through which your reset pulse can be 
applied with test vectors during simulation. 

Using VHDL Reset-On-Configuration (ROC) Cell 
(Case 1A) 

For Case 1A, the ROC (Reset-On-Configuration) instantiated 
component model is used. This model creates a one-shot pulse for the global 
set/reset signal. The pulse width is a generic and can be configured to 
match the device and conditions specified. The ROC cell is in the 
post-routed netlist and, with the same pulse width, it mimics the 
pre-route global set/reset net. The following is an example of an ROC 
cell. 

Note: The TPOR parameter from The Programmable Logic Data Book is 
used as the WIDTH parameter. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity EX_ROC is
  port (CLOCK, ENABLE : in std_logic;
        CUP, CDOWN : out std_logic_vector (3 downto 0));
end EX_ROC;

architecture A of EX_ROC is
  signal GSR : std_logic;
  signal COUNT_UP, COUNT_DOWN : std_logic_vector (3 downto 0);
  component ROC
    port (O : out std_logic);
  end component;
begin
  -- The ROC cell drives the one-shot global set/reset pulse
  U1 : ROC port map (O => GSR);

  -- Up counter, asynchronously cleared by GSR
  UP_COUNTER : process (CLOCK, ENABLE, GSR)
  begin
    if (GSR = '1') then
      COUNT_UP <= "0000";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_UP <= COUNT_UP + "0001";
      end if;
    end if;
  end process UP_COUNTER;

  -- Down counter, asynchronously preset by GSR or on reaching "0101"
  DOWN_COUNTER : process (CLOCK, ENABLE, GSR, COUNT_DOWN)
  begin
    if (GSR = '1' OR COUNT_DOWN = "0101") then
      COUNT_DOWN <= "1111";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_DOWN <= COUNT_DOWN - "0001";
      end if;
    end if;
  end process DOWN_COUNTER;

  CUP <= COUNT_UP;
  CDOWN <= COUNT_DOWN;
end A;

Using ROC Cell Implementation Model (Case 1A) 

Complementary to the previous VHDL model is an implementation 
model that guides the place and route tool to connect the net driven 
by the ROC cell to the special purpose net. 

This cell is created during back-annotation if you do not use the -gp 
or STARTUP block options. It can be instantiated in the front end to 
match functionality with GSR, GR, or PRLD (in both functional and 
timing simulation). During back-annotation, the entity and 
architecture for the ROC cell are placed in your design's output VHDL file. In 
the front end, the entity and architecture are in the UniSim Library, 
requiring only a component instantiation. The ROC cell generates a 
one-time initial pulse to drive the GR, GSR, or PRLD net starting at 
time zero for a specified pulse width. You can set the pulse width 
with a generic in a configuration statement. The default value of the 
pulse width is 0 ns. This value disables the ROC cell and causes the 
global set/reset to be held low. (Active low resets are handled within 
the netlist itself and need to be inverted before using.) 
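
The one-shot behavior described above can be pictured with a small behavioral model. The following is only a sketch under stated assumptions: ROC_SKETCH is a hypothetical stand-in written for illustration, not the UniSim source for the ROC cell, and the actual library model may differ in detail.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical stand-in for the ROC cell; not the UniSim source.
entity ROC_SKETCH is
  generic (WIDTH : time := 0 ns);  -- pulse width; 0 ns disables the pulse
  port (O : out std_logic);
end ROC_SKETCH;

architecture BEHAV of ROC_SKETCH is
begin
  ONE_SHOT : process
  begin
    -- Drive the output high from time zero for WIDTH, but only when a
    -- positive width has been configured.
    if (WIDTH > 0 ns) then
      O <= '1';
      wait for WIDTH;
    end if;
    -- After the pulse (or immediately, when WIDTH is 0 ns) hold the
    -- global set/reset low for the rest of the simulation.
    O <= '0';
    wait;
  end process ONE_SHOT;
end BEHAV;
```

With the generic left at 0 ns, the output never pulses, which corresponds to the disabled case described above.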

ROC Test Bench (Case 1A) 

With the ROC cell you can simulate with the same test bench used in 
RTL simulation, and you can control the width of the global set/reset 
signal in your implemented design. ROC cells require a generic 
WIDTH value, usually specified with a configuration statement. 
Otherwise, a generic map is required as part of the component 
instantiation. You can set the generic with any generic mapping method. Set 
the width generic after consulting The Programmable Logic Data Book 
for the particular part and mode implemented. For example, the value 
for an XC4000E part can vary from 10 ms to 130 ms. Use the TPOR 
parameter in the Configuration Switching Characteristics tables for Master, 
Slave, and Peripheral modes. The following is the test bench for the 
ROC example. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity test_ofex_roc is end test_ofex_roc;

architecture inside of test_ofex_roc is
  Component ex_roc
    Port ( CLOCK, ENABLE: in STD_LOGIC;
           CUP, CDOWN: out STD_LOGIC_VECTOR (3 downto 0));
  End component;


Begin

  UUT: ex_roc port map(. . . .);


End inside;

The best method for mapping the generic is a configuration in your 
test bench, as shown in the following example. 

library UNISIM;

Configuration overall of test_ofex_roc is
  For inside
    For UUT:ex_roc
      For A
        -- Bind the ROC instance and set its pulse width
        For U1:ROC use entity UNISIM.ROC (ROC_V)
          Generic map (WIDTH=>52 ns);
        End for;
      End for;
    End for;
  End for;
End overall;

This configuration is for pre-NGDBuild simulation. A similar 
configuration is used for post-NGDBuild simulation. The ROC, TOC, and 
OSC4 are mapped to the WORK library, and corresponding 
architecture names may be different. Review the .vhd file created by 
NGD2VHDL for the current entity and architecture names for 
post-NGDBuild simulation. 
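
When a configuration is not used, the WIDTH generic can instead be set with a generic map in the component instantiation. The following is a minimal sketch of that alternative. The entity name EX_ROC_GENMAP and its D and Q ports are hypothetical, and the component declaration assumes that the UniSim ROC entity declares WIDTH as a generic of type time, which is consistent with the 52 ns value used in the configuration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
library UNISIM;
use UNISIM.all;

-- Hypothetical design used only to show the generic map.
entity EX_ROC_GENMAP is
  port (CLOCK, D : in  std_logic;
        Q        : out std_logic);
end EX_ROC_GENMAP;

architecture A of EX_ROC_GENMAP is
  signal GSR : std_logic;
  component ROC
    generic (WIDTH : time);  -- assumed to match the UniSim ROC generic
    port (O : out std_logic);
  end component;
begin
  -- Pulse width set at the point of instantiation, not in a configuration
  U1 : ROC
    generic map (WIDTH => 52 ns)
    port map (O => GSR);

  -- Register cleared asynchronously by the emulated global set/reset
  REG : process (CLOCK, GSR)
  begin
    if (GSR = '1') then
      Q <= '0';
    elsif (CLOCK'event and CLOCK = '1') then
      Q <= D;
    end if;
  end process REG;
end A;
```

A configuration keeps the pulse width out of the design source, so the same RTL can be simulated with different TPOR values; the generic map ties one value to the source file.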

ROC Model in Four Design Phases (Case 1A) 

The following figure shows the progression of the ROC model and its 
interpretation in the four main design phases. 

Figure 5-3 ROC Simulation and Implementation 

• Behavioral Phase: In this phase, the behavioral or RTL 
description registers are inferred from the coding style, and the ROC cell 
can be instantiated. If it is not instantiated, the signal is not driven 
during simulation or is driven within the architecture by code 
that cannot be synthesized. Some synthesizers infer the local 
resets that are best for the global signal and insert the ROC cell 
automatically. When this occurs, instantiation may not be 
required unless RTL level simulation is needed. The synthesizer 
may allow you to select the reset line to drive the ROC cell. Xilinx 
recommends instantiation of the ROC cell during RTL coding 
because the global signal is easily identified. This also ensures 
that GSR behavior at the RTL level matches the behavior of the 
post-synthesis and implementation netlists. 

• Synthesized Phase: In this phase, inferred registers are mapped 
to a technology and the ROC instantiation is either carried from 
the RTL or inserted by the synthesis tools. As a result, consistent 
global set/reset behavior is maintained between the RTL and 
synthesized structural descriptions during simulation. 

• Implemented Phase: During implementation, the ROC is 
removed from the logical description that is placed and routed 
because it is a pre-existing circuit on the chip. The ROC is removed by making 
the output of the ROC cell appear as an open circuit. Then the 
implementation tool can trim all the nets driven by the ROC to 
the local sets or resets of the registers, and the nets are not routed 
in general purpose routing. All set/resets for the registers are 
automatically assumed to be driven by the global set/reset net so 
data is not lost. 

• Back-annotated Phase: In this phase, the Xilinx VHDL netlist 
program assumes all registers are driven by the GSR net; replaces 
the ROC cell; and rewires it to the GSR nets in the back-annotated 
netlist. The GSR net is a fully wired net and the ROC cell is 
inserted to drive it. A similar VHDL configuration can be used to 
set the generic for the pulse width. 

Using VHDL ROCBUF Cell (Case 1B) 

For Case 1B, the ROCBUF (Reset-On-Configuration Buffer) 
instantiated component is used. This component creates a buffer for the 
global set/reset signal, and provides an input port on the buffer to 
drive the global set/reset line. During the place and route process, this 
port is removed so it is not implemented on the chip. ROCBUF does 
not reappear in the post-routed netlist. Instead, you can select an 
implementation option to add a global set/reset port to the 
back-annotated netlist. A buffer is not necessary since the implementation 
directive is no longer required. 

The following example illustrates how to use the ROCBUF in your 
designs. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity EX_ROCBUF is
  port (CLOCK, ENABLE, SRP : in std_logic;
        CUP, CDOWN : out std_logic_vector (3 downto 0));
end EX_ROCBUF;

architecture A of EX_ROCBUF is
  signal GSR : std_logic;
  signal COUNT_UP, COUNT_DOWN : std_logic_vector (3 downto 0);
  component ROCBUF
    port (I : in std_logic;
          O : out std_logic);
  end component;
begin
  -- The ROCBUF cell buffers the simulation-only reset port SRP onto GSR
  U1 : ROCBUF port map (I => SRP, O => GSR);

  -- Up counter, asynchronously cleared by GSR
  UP_COUNTER : process (CLOCK, ENABLE, GSR)
  begin
    if (GSR = '1') then
      COUNT_UP <= "0000";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_UP <= COUNT_UP + "0001";
      end if;
    end if;
  end process UP_COUNTER;

  -- Down counter, asynchronously preset by GSR or on reaching "0101"
  DOWN_COUNTER : process (CLOCK, ENABLE, GSR, COUNT_DOWN)
  begin
    if (GSR = '1' OR COUNT_DOWN = "0101") then
      COUNT_DOWN <= "1111";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_DOWN <= COUNT_DOWN - "0001";
      end if;
    end if;
  end process DOWN_COUNTER;

  CUP <= COUNT_UP;
  CDOWN <= COUNT_DOWN;
end A;
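
For Case 1B, the reset pulse comes from the test bench through the SRP port. The following is a sketch of such a test bench for the EX_ROCBUF example. The test bench name, the 20 ns clock period, and the 100 ns pulse widths are placeholders chosen for illustration and should be replaced with values appropriate to your device. The assertion assumes that the ROCBUF simulation model passes its input through to its output without significant delay.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity TEST_EX_ROCBUF is end TEST_EX_ROCBUF;  -- hypothetical name

architecture INSIDE of TEST_EX_ROCBUF is
  component EX_ROCBUF
    port (CLOCK, ENABLE, SRP : in std_logic;
          CUP, CDOWN : out std_logic_vector (3 downto 0));
  end component;
  signal CLOCK  : std_logic := '0';
  signal ENABLE : std_logic := '1';
  signal SRP    : std_logic := '1';
  signal CUP, CDOWN : std_logic_vector (3 downto 0);
  signal DONE   : boolean := false;
begin
  UUT : EX_ROCBUF port map (CLOCK => CLOCK, ENABLE => ENABLE,
                            SRP => SRP, CUP => CUP, CDOWN => CDOWN);

  -- Free-running clock that stops when the stimulus is finished
  CLOCK_GEN : process
  begin
    while not DONE loop
      CLOCK <= '0'; wait for 10 ns;
      CLOCK <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process CLOCK_GEN;

  STIMULUS : process
  begin
    -- Initial pulse at the start of simulation time
    SRP <= '1';
    wait for 50 ns;
    assert (CUP = "0000" and CDOWN = "1111")
      report "Counters not initialized by the reset pulse"
      severity error;
    wait for 50 ns;
    SRP <= '0';
    wait for 200 ns;
    -- Further pulse, for example to mimic cycling power to the device
    SRP <= '1';
    wait for 100 ns;
    SRP <= '0';
    wait for 200 ns;
    DONE <= true;
    wait;
  end process STIMULUS;
end INSIDE;
```

Because SRP exists only for simulation, the same stimulus can be reused after implementation only if the global set/reset port is added back to the back-annotated netlist.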


ROCBUF Model in Four Design Phases (Case 1B) 

The following figure shows the progression of the ROCBUF model 
and its interpretation in the four main design phases. 

Figure 5-4 ROCBUF Simulation and Implementation 

• Behavioral Phase: In this phase, the behavioral or RTL 
description registers are inferred from the coding style, and the ROCBUF 
cell can be instantiated. If it is not instantiated, the signal is not 
driven during simulation, or it is driven within the architecture 
by code that cannot be synthesized. Use the ROCBUF cell instead 
of the ROC cell when you want test bench control of GSR 
simulation. Xilinx recommends instantiating the ROCBUF cell during 
RTL coding because the global signal is easily identified, and you 
are not relying on a synthesis tool feature that may not be 
available if ported to another tool. This also ensures that GSR behavior 
at the RTL level matches the behavior of the post-synthesis and 
implementation netlists. 

• Synthesized Phase: In this phase, inferred registers are mapped 
to a technology and the ROCBUF instantiation is either carried 
from the RTL or inserted by the synthesis tools. As a result, 
consistent global set/reset behavior is maintained between the 
RTL and synthesized structural descriptions during simulation. 

• Implemented Phase: During implementation, the ROCBUF is 
removed from the logical description that is placed and routed 
because it is a pre-existing circuit on the chip. The ROCBUF is removed by 
making the input and the output of the ROCBUF cell appear as 
an open circuit. Then the implementation tool can trim the port 
that drives the ROCBUF input, as well as the nets driven by the 
ROCBUF output. As a result, nets are not routed in general 
purpose routing. All set/resets for the registers are automatically 
assumed to be driven by the global set/reset net so data is not 
lost. You can use a VHDL netlist tool option to add the port back. 

• Back-annotated Phase: In this phase, the Xilinx VHDL netlist 
program starts with all registers initialized by the GSR net, and it 
replaces the ROC cell it would normally insert with a port if the 
GSR port option is selected. The GSR net is a fully wired net 
driven by the added GSR port. A ROCBUF cell is not required 
because the port is sufficient for simulation, and implementation 
directives are not required. 

Using VHDL STARTBUF Block (Case 2A and 2B) 

The STARTUP block is traditionally instantiated to identify the GR, 
PRLD, or GSR signals for implementation if the global reset or 
tristate is connected to a chip pin. However, this implementation 
directive component cannot be simulated, and causes warning 
messages from the simulator. You can use the STARTBUF 
cell instead, which can be simulated. STARTUP blocks are allowed if 
the warnings can be addressed or safely ignored. 

For Cases 2A and 2B, use the STARTBUF cell. This cell provides 
access to the input and output ports of the STARTUP cell that direct 
the implementation tool to use the global networks. The input and 
output port names differ from the names of the corresponding ports 
of the STARTUP cell. This was done for the following reasons. 

• To make the STARTBUF a model that can be simulated with 
inputs and outputs. The STARTUP cell hangs from the net it is 
connected to. 

• To make one model that works for all Xilinx technologies. The 
XC4000 and XC5200 families require different STARTUP cells 
because the XC5200 has a global reset (GR) net and not a GSR. 

The mapping to the architecture-specific STARTUP cell from the 
instantiation of the STARTBUF is done during implementation. The 
STARTBUF pins have the suffix "IN" (input port) or "OUT" (output 
port). Two additional output ports, GSROUT and GTSOUT, are 
available to drive a signal for clearing or setting a design's registers 
(GSROUT), or for tri-stating your design's I/Os (GTSOUT). 

The input ports, GSRIN and GTSIN, can be connected either directly 
or indirectly via combinational logic to input ports of your design. 
Your design's input ports appear as input pins in the implemented 
design. The design input port connected to the input port, GSRIN, is 
then referred to as the device reset port, and the design input port 
connected to the input port, GTSIN, is referred to as the device 
tristate port. The following table shows the correspondence of pins 
between STARTBUF and STARTUP. 

Table 5-8 STARTBUF/STARTUP Pin Descriptions 

STARTBUF   Connection Point         XC4000          XC5200          Spartan 
Pin Name                            STARTUP         STARTUP 
                                    Pin Name        Pin Name 
---------  -----------------------  --------------  --------------  -------------- 
GSRIN      Global Set/Reset Port    GSR             GR              GSR 
           of Design 
GTSIN      Global Tristate Port     GTS             GTS             GTS 
           of Design 
GSROUT     All Registers            Not Available;  Not Available;  Not Available; 
           Asynchronous Set/Reset   For Simulation  For Simulation  For Simulation 
                                    Only            Only            Only 
GTSOUT     All Output Buffers       Not Available;  Not Available;  N/A 
           Tristate Control         For Simulation  For Simulation 
                                    Only            Only 
CLKIN      Port or Internal Logic   CLK             CLK             CLK 
Q2OUT      Port or Internal Logic   Q2              Q2              Q2 
Q3OUT      Port or Internal Logic   Q3              Q3              Q3 
Q1Q4OUT    Port or Internal Logic   Q1Q4            Q1Q4            Q1Q4 
DONEINOUT  Port or Internal Logic   DONEIN          DONEIN          DONEIN 


Note: Using STARTBUF indicates that you want to access the global 
set/reset and/or tristate pre-routed networks available in your 
design's target device. As a result, you must provide the stimulus for 
emulating the automatic pulse as well as the user-defined set/reset. 
This allows you complete control of the reset network from the test 
bench. 

The following example shows how to use the STARTBUF cell. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity EX_STARTBUF is
  port (CLOCK, ENABLE, DRP, DTP : in std_logic;
        CUP, CDOWN : out std_logic_vector (3 downto 0));
end EX_STARTBUF;

architecture A of EX_STARTBUF is
  signal GSR, GSRIN_NET, GROUND, GTS : std_logic;
  signal COUNT_UP, COUNT_DOWN : std_logic_vector (3 downto 0);
  component STARTBUF
    port (GSRIN, GTSIN, CLKIN : in std_logic; GSROUT, GTSOUT,
          DONEINOUT, Q1Q4OUT, Q2OUT, Q3OUT : out std_logic);
  end component;
begin
  GROUND <= '0';
  -- The device reset port DRP is active low, so invert it for GSRIN
  GSRIN_NET <= NOT DRP;

  U1 : STARTBUF port map (GSRIN => GSRIN_NET, GTSIN => DTP,
       CLKIN => GROUND, GSROUT => GSR, GTSOUT => GTS);

  -- Up counter, asynchronously cleared by GSR
  UP_COUNTER : process (CLOCK, ENABLE, GSR)
  begin
    if (GSR = '1') then
      COUNT_UP <= "0000";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_UP <= COUNT_UP + "0001";
      end if;
    end if;
  end process UP_COUNTER;

  -- Down counter, asynchronously preset by GSR or on reaching "0101"
  DOWN_COUNTER : process (CLOCK, ENABLE, GSR, COUNT_DOWN)
  begin
    if (GSR = '1' OR COUNT_DOWN = "0101") then
      COUNT_DOWN <= "1111";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_DOWN <= COUNT_DOWN - "0001";
      end if;
    end if;
  end process DOWN_COUNTER;

  -- Outputs are tristated while GTS is active
  CUP <= COUNT_UP when (GTS = '0' AND COUNT_UP /= "0000") else "ZZZZ";
  CDOWN <= COUNT_DOWN when (GTS = '0') else "ZZZZ";
end A;
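
Because STARTBUF does not generate a pulse of its own, the test bench must emulate both the automatic pulse and any later user reset. The following sketch shows one way to do this for the EX_STARTBUF example. The test bench name, clock period, and pulse widths are placeholders, and the assertion assumes that the STARTBUF simulation model passes GSRIN and GTSIN through to GSROUT and GTSOUT without significant delay.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity TEST_EX_STARTBUF is end TEST_EX_STARTBUF;  -- hypothetical name

architecture INSIDE of TEST_EX_STARTBUF is
  component EX_STARTBUF
    port (CLOCK, ENABLE, DRP, DTP : in std_logic;
          CUP, CDOWN : out std_logic_vector (3 downto 0));
  end component;
  signal CLOCK  : std_logic := '0';
  signal ENABLE : std_logic := '1';
  signal DRP    : std_logic := '0';  -- active-low device reset port
  signal DTP    : std_logic := '1';  -- active-high device tristate port
  signal CUP, CDOWN : std_logic_vector (3 downto 0);
  signal DONE   : boolean := false;
begin
  UUT : EX_STARTBUF port map (CLOCK => CLOCK, ENABLE => ENABLE,
                              DRP => DRP, DTP => DTP,
                              CUP => CUP, CDOWN => CDOWN);

  -- Free-running clock that stops when the stimulus is finished
  CLOCK_GEN : process
  begin
    while not DONE loop
      CLOCK <= '0'; wait for 10 ns;
      CLOCK <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process CLOCK_GEN;

  STIMULUS : process
  begin
    -- Emulate the automatic pulse: reset asserted, outputs tristated
    DRP <= '0';
    DTP <= '1';
    wait for 50 ns;
    assert (CDOWN = "ZZZZ")
      report "Outputs not tristated during the initial pulse"
      severity error;
    wait for 50 ns;
    DRP <= '1';
    DTP <= '0';
    wait for 200 ns;
    -- User-defined reset later in the simulation
    DRP <= '0';
    wait for 40 ns;
    DRP <= '1';
    wait for 200 ns;
    DONE <= true;
    wait;
  end process STIMULUS;
end INSIDE;
```

DRP and DTP are real device pins in Cases 2A and 2B, so this stimulus applies unchanged to the back-annotated netlist.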


GTS Network Design Cases 

Just as for the global set/reset net, there are three cases for using your 
device's output tristate enable (GTS) network, as shown in the 
following table. 

Table 5-9 GTS Design Cases 

Name      Description 
--------  ------------------------------------------------------- 
Case A    Tristate-On-Configuration only; no user control of GTS 
Case A1   Simulation Model TOC tristates output buffers during 
          configuration or power-up 
Case A2   User initializes sequential elements with TOCBUF 
          model and simulation vectors 
Case B    User control of GTS after Tristate-On-Configuration 
Case B1   External port driving GTS 
Case B2   Internal signal driving GTS 
Case C    Don't Care 


Case A is defined as follows. 

• Tristating of output buffers during power-on or configuration of 
the device 

• Output buffers are tristated and reflected in the implemented and 
simulated design 

• Two sub-cases 

• In Case A1, you do not provide the simulation with an 
initialization pulse. The simulation model provides its own 
mechanism for initializing its sequential elements (such as the real 
device does when power is first applied). 

• In Case A2, you can control the initializing 
Tristate-On-Configuration pulse. This case is applicable when 
system-level issues make your design's configuration synchronous 
with an off-chip event. In this case, you provide a pulse to 
tristate the output buffers at the start of simulation time, and 
possibly provide further pulses as simulation time progresses 
(perhaps to simulate cycling power to the device). Although 
you are providing the Tristate-On-Configuration pulse to the 
simulation model, this pulse is not required for the 
implemented device. A Tristate-On-Configuration port is not 
required on the implemented device; however, a TOC port is 
required in the behavioral code through which your TOC 
pulse can be applied with test vectors during simulation. 

Using VHDL Tristate-On-Configuration (TOC) 

The TOC cell is created if you do not use the -tp or STARTUP block 
options. The entity and architecture for the TOC cell are placed in the 
design's output VHDL file. The TOC cell generates a one-time initial 
pulse to drive the GR, GSR, or PRLD net starting at time '0' for a 
user-defined pulse width. The pulse width can be set with a generic. The 
default WIDTH value is 0 ns, which disables the TOC cell and holds 
the tristate enable low. (Active low tristate enables are handled within 
the netlist itself; you must invert this signal before using it.) 

The TOC cell enables you to simulate with the same test bench as in 
the RTL simulation, and also allows you to control the width of the 
tristate enable signal in your implemented design. 

The TOC components require a value for the generic WIDTH, usually 
specified with a configuration statement. Otherwise, a generic map is 
required as part of the component instantiation. 

You may set the generic with any generic mapping method you 
choose. Set the WIDTH generic after consulting The Programmable 
Logic Data Book for the particular part and mode you have 
implemented. For example, the value for an XC4000E part can vary from 
10 ms to 130 ms. Use the TPOR (Power-On Reset) parameter found in the 
Configuration Switching Characteristics tables for Master, Slave, and 
Peripheral modes. 

VHDL TOC Cell (Case A1) 

For Case A1, use the TOC (Tristate-On-Configuration) instantiated 
component. This component creates a one-shot pulse for the global 
Tristate-On-Configuration signal. The pulse width is a generic and 
can be selected to match the device and conditions you want. The 
TOC cell is in the post-routed netlist and, with the same pulse width 
set, it mimics the pre-route Tristate-On-Configuration net. 

TOC Cell Instantiation (Case A1) 

The following is an example of how to use the TOC cell. 

Note: The TPOR parameter from The Programmable Logic Data Book is 
used as the WIDTH parameter in this example. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity EX_TOC is
  port (CLOCK, ENABLE : in std_logic;
        CUP, CDOWN : out std_logic_vector (3 downto 0));
end EX_TOC;

architecture A of EX_TOC is
  signal GSR, GTS : std_logic;
  signal COUNT_UP, COUNT_DOWN : std_logic_vector (3 downto 0);
  component ROC
    port (O : out std_logic);
  end component;
  component TOC
    port (O : out std_logic);
  end component;
begin
  -- ROC drives the global set/reset pulse; TOC drives the global
  -- tristate pulse
  U1 : ROC port map (O => GSR);
  U2 : TOC port map (O => GTS);

  -- Up counter, asynchronously cleared by GSR
  UP_COUNTER : process (CLOCK, ENABLE, GSR)
  begin
    if (GSR = '1') then
      COUNT_UP <= "0000";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_UP <= COUNT_UP + "0001";
      end if;
    end if;
  end process UP_COUNTER;

  -- Down counter, asynchronously preset by GSR or on reaching "0101"
  DOWN_COUNTER : process (CLOCK, ENABLE, GSR, COUNT_DOWN)
  begin
    if (GSR = '1' OR COUNT_DOWN = "0101") then
      COUNT_DOWN <= "1111";
    elsif (CLOCK'event AND CLOCK = '1') then
      if (ENABLE = '1') then
        COUNT_DOWN <= COUNT_DOWN - "0001";
      end if;
    end if;
  end process DOWN_COUNTER;

  -- Outputs are tristated while GTS is active
  CUP <= COUNT_UP when (GTS = '0' AND COUNT_UP /= "0000") else "ZZZZ";
  CDOWN <= COUNT_DOWN when (GTS = '0') else "ZZZZ";
end A;


TOC Test Bench (Case A1) 

The following is the test bench for the TOC example. 

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
library UNISIM;
use UNISIM.all;

entity test_ofex_toc is end test_ofex_toc;

architecture inside of test_ofex_toc is
  Component ex_toc
    Port ( CLOCK, ENABLE: in STD_LOGIC;
           CUP, CDOWN: out STD_LOGIC_VECTOR (3 downto 0));
  End component;

Begin

  UUT: ex_toc port map(. . . .);


End inside;

The best method for mapping the generic is a configuration in the test 
bench, as shown in the following example. 

library UNISIM;

Configuration overall of test_ofex_toc is
  For inside
    For UUT:ex_toc
      For A
        -- U2 is the TOC instance in EX_TOC
        For U2:TOC use entity UNISIM.TOC (TOC_V)
          Generic map (WIDTH=>52 ns);
        End for;
      End for;
    End for;
  End for;
End overall;

This configuration is for pre-NGDBuild simulation. A similar 
configuration is used for post-NGDBuild simulation. The ROC, TOC, and 
OSC4 are mapped to the WORK library, and corresponding 
architecture names may be different. Review the .vhd file created by 
NGD2VHDL for the current entity and architecture names for 
post-NGDBuild simulation. 
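
The EX_TOC example instantiates both a ROC cell (U1) and a TOC cell (U2), so a single configuration can set both pulse widths. The following is a sketch of such a configuration for pre-NGDBuild simulation. The configuration name is hypothetical; the entity, architecture, and instance names follow the EX_TOC example and its test bench; and the 52 ns values are placeholders that should be replaced with the TPOR value for your device.

```vhdl
library UNISIM;

-- Hypothetical configuration name; other names follow the EX_TOC
-- example and its test bench.
configuration OVERALL_ROC_TOC of test_ofex_toc is
  for inside
    for UUT : ex_toc
      for A
        -- Global set/reset pulse width
        for U1 : ROC use entity UNISIM.ROC (ROC_V)
          generic map (WIDTH => 52 ns);
        end for;
        -- Global tristate pulse width
        for U2 : TOC use entity UNISIM.TOC (TOC_V)
          generic map (WIDTH => 52 ns);
        end for;
      end for;
    end for;
  end for;
end OVERALL_ROC_TOC;
```

Keeping both bindings in one configuration means the reset and tristate pulse widths are chosen in one place when the target device or configuration mode changes.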

TOC Model in Four Design Phases (Case A1) 

The following figure shows the progression of the TOC model and its 
interpretation in the four main design phases. 

Figure 5-5 TOC Simulation and Implementation 

• Behavioral Phase: In this phase, the behavioral or RTL 
description of the output buffers is inferred from the coding style. The 
TOC cell can be instantiated. If it is not instantiated, the GTS 
signal is not driven during simulation or is driven within the 
architecture by code that cannot be synthesized. Some 
synthesizers can infer which of the local output tristate enables is best 
for the global signal, and will insert the TOC cell automatically so 
instantiation may not be required unless RTL level simulation is 
desired. The synthesizer may also allow you to select the output 
tristate enable line you want driven by the TOC cell. Instantiation 
of the TOC cell in the RTL description is recommended because 
you can immediately identify what signal is the global signal, 
and you are not relying on a synthesis tool feature that may not 
be available if ported to another tool. 

• Synthesized Phase: In this phase, the inferred registers are 
mapped to a device, and the TOC instantiation is either carried 
from the RTL or is inserted by the synthesis tools. This results in 
maintaining consistent global output tristate enable behavior 
between the RTL and the synthesized structural descriptions 
during simulation. 

• Implemented Phase: During implementation, the TOC is 
removed from the logical description that is placed and routed 
because it is a pre-existing circuit on the chip. The TOC is 
removed by making the input and output of the TOC cell appear 
as an open circuit. This allows the router to remove all nets 
driven by the TOC cell as if they were undriven nets. The VHDL 
netlist program assumes all output tristate enables are driven by 
the global output tristate enable so data is not lost. 

• Back-annotation Phase: In this phase, the VHDL netlist tool 
re-inserts a TOC component for simulation purposes. The GTS net is 
a fully wired net and the TOC cell is inserted to drive it. You can 
use a configuration similar to the VHDL configuration for RTL 
simulation to set the generic for the pulse width. 

Using VHDL TOCBUF (Case B1) 

For Case B1, use the TOCBUF (Tristate-On-Configuration Buffer) 
instantiated component model. This model creates a buffer for the 
global output tristate enable signal. You now have an input port on 
the buffer to drive the global output tristate enable line. The 
implementation model directs the place and route tool to remove the 
port so it is not implemented on the actual chip. The TOCBUF cell 
does not reappear in the post-routed netlist. Instead, you can select 
an option on the implementation tool to add a global output tristate 
enable port to the back-annotated netlist. A buffer is not necessary 
because the implementation directive is no longer required. 

TOCBUF Model Example (Case B1) 

The following is an example of the TOCBUF model. 

library IEEE; 
use IEEE.std_logic_1164.all; 
use IEEE.std_logic_unsigned.all; 
library UNISIM; 
use UNISIM.all; 

entity EX_TOCBUF is 
  port (CLOCK, ENABLE, SRP, STP : in std_logic; 
        CUP, CDOWN : out std_logic_vector (3 downto 0)); 
end EX_TOCBUF; 

architecture A of EX_TOCBUF is 
  signal GSR, GTS : std_logic; 
  signal COUNT_UP, COUNT_DOWN : std_logic_vector (3 downto 0); 
  component ROCBUF 
    port (I : in std_logic; 
          O : out std_logic); 
  end component; 
  component TOCBUF 
    port (I : in std_logic; 
          O : out std_logic); 
  end component; 
begin 
  U1 : ROCBUF port map (I => SRP, O => GSR); 
  U2 : TOCBUF port map (I => STP, O => GTS); 

  UP_COUNTER : process (CLOCK, ENABLE, GSR) 
  begin 
    if (GSR = '1') then 
      COUNT_UP <= "0000"; 
    elsif (CLOCK'event AND CLOCK = '1') then 
      if (ENABLE = '1') then 
        COUNT_UP <= COUNT_UP + "0001"; 
      end if; 
    end if; 
  end process UP_COUNTER; 

  DOWN_COUNTER : process (CLOCK, ENABLE, GSR, COUNT_DOWN) 
  begin 
    if (GSR = '1' OR COUNT_DOWN = "0101") then 
      COUNT_DOWN <= "1111"; 
    elsif (CLOCK'event AND CLOCK = '1') then 
      if (ENABLE = '1') then 
        COUNT_DOWN <= COUNT_DOWN - "0001"; 
      end if; 
    end if; 
  end process DOWN_COUNTER; 

  CUP <= COUNT_UP when (GTS = '0' AND COUNT_UP /= "0000") else "ZZZZ"; 
  CDOWN <= COUNT_DOWN when (GTS = '0') else "ZZZZ"; 
end A; 




TOCBUF Model In Four Design Phases (Case B1) 

The following figure shows the progression of the TOCBUF model 
and its interpretation in the four main design phases. 




Figure 5-6 TOCBUF Simulation and Implementation 

• Behavioral Phase—In this phase, the behavioral or RTL description 
of the output buffers is inferred from the coding style and 
may be inserted. You can instantiate the TOCBUF cell. If it is not 
instantiated, the GTS signal is not driven during simulation or it 
is driven within the architecture by code that cannot be synthesized. 
Some synthesizers can infer the local output tristate 
enables that make the best global signals, and will insert the 
TOCBUF cell automatically. As a result, instantiation may not be 
required unless you want RTL level simulation. The synthesizer 
can allow you to select the output tristate enable line you want 
driven by the TOCBUF cell. Instantiation of the TOCBUF cell in 
the RTL description is recommended because you can immediately 
identify which signal is the global signal and you are not 
relying on a synthesis tool feature that may not be available if 
ported to another tool. 

• Synthesized Phase—In this phase, the inferred output buffers 
are mapped to a device and the TOCBUF instantiation is either 
carried from the RTL or is inserted by the synthesis tools. This 
maintains consistent global output tristate enable behavior 
between the RTL and the synthesized structural descriptions 
during simulation. 

• Implemented Phase—In this phase, the TOCBUF is removed 
from the logical description that is placed and routed because it is 
a pre-existing circuit on the chip. 

The TOCBUF is removed by making the input and output of the 
TOCBUF cell appear as an open circuit. This allows the router to 
remove all nets driven by the TOCBUF cell as if they were 
undriven nets. The VHDL netlist program assumes all output 
tristate enables are driven by the global output tristate enable so 
data is not lost. 

• Back-annotated Phase—In this phase, the TOCBUF cell does not 
reappear in the post-routed netlist. Instead, you can select an 
option in the implementation tool to add a global output tristate 
enable port to the back-annotated netlist. A buffer is not necessary 
because the implementation directive is no longer required. 
If the option is not selected, the VHDL netlist tool re-inserts a 
TOCBUF component for simulation purposes. The GTS net is a 
fully wired net and the TOCBUF cell is inserted to drive it. You 
can use a configuration similar to the VHDL configuration used 
for RTL simulation to set the generic for the pulse width. 

Using Oscillators (VHDL) 

Oscillator output can vary within a fixed range. This cell is not 
included in the SimPrim library because you cannot drive global 
signals in VHDL designs. Schematic simulators can define and drive 
global nets so the cell is not required. Verilog has the ability to drive 
nets within a lower level module as well. Therefore the oscillator cells 
are only required in VHDL. After back-annotation, their entity and 
architectures are contained in your design's VHDL output. For 
functional simulation, they can be instantiated and simulated with the 
UniSim Library. 

The period of the base frequency must be set in order for the 
simulation to proceed, since the default period of 0 ns disables the 
oscillator. The oscillator's frequency can vary significantly with 
process and temperature. 

Before you set the base period parameter, consult The Programmable 
Logic Data Book for the part you are using. For example, the section in 
The Programmable Logic Data Book for the XC4000 Series On-Chip 
Oscillator states that the base frequency can vary from 4 MHz to 10 
MHz, and is nominally 8 MHz. This means that the base period 
generic "period_8m" in the XC4000E OSC4 VHDL model can range 
from 250 ns to 100 ns. An example of this follows. 

Oscillator VHDL Example 

library IEEE; 
use IEEE.std_logic_1164.all; 
use IEEE.std_logic_unsigned.all; 

library UNISIM; 
use UNISIM.all; 

entity test1 is 
  port (DATAIN: in STD_LOGIC; 
        DATAOUT: out STD_LOGIC); 
end test1; 

architecture inside of test1 is 
  signal RST: STD_LOGIC; 
  component ROC 
    port (O: out STD_LOGIC); 
  end component; 
  component OSC4 
    port (F8M: out STD_LOGIC); 
  end component; 
  signal internalclock: STD_LOGIC; 
begin 
  U0: ROC port map (RST); 
  U1: OSC4 port map (F8M=>internalclock); 

  process (internalclock) 
  begin 
    if (RST='1') then 
      DATAOUT <= '0'; 
    elsif (internalclock'event and internalclock='1') then 
      DATAOUT <= DATAIN; 
    end if; 
  end process; 
end inside; 


Oscillator Test Bench 

library IEEE; 
use IEEE.std_logic_1164.all; 
use IEEE.std_logic_unsigned.all; 

library UNISIM; 
use UNISIM.all; 

entity test_oftest1 is end test_oftest1; 

architecture inside of test_oftest1 is 
  component test1 
    port (DATAIN: in STD_LOGIC; 
          DATAOUT: out STD_LOGIC); 
  end component; 
  signal userdata, userout: STD_LOGIC; 
begin 
  UUT: test1 port map (DATAIN=>userdata, DATAOUT=>userout); 

  myinput: process 
  begin 
    userdata <= '1'; 
    wait for 299 ns; 
    userdata <= '0'; 
    wait for 501 ns; 
  end process; 
end inside; 

configuration overall of test_oftest1 is 
  for inside 
    for UUT:test1 
      for inside 
        for U0:ROC use entity UNISIM.ROC(ROC_V) 
          generic map (WIDTH=> 52 ns); 
        end for; 
        for U1:OSC4 use entity UNISIM.OSC4(OSC4_V) 
          generic map (PERIOD_8M=> 25 ns); 
        end for; 
      end for; 
    end for; 
  end for; 
end overall; 

This configuration is for pre-NGDBuild simulation. A similar 
configuration is used for post-NGDBuild simulation. The ROC, TOC, and 
OSC4 are mapped to the WORK library, and corresponding 
architecture names may be different. Review the .vhd file created by 
NGD2VHDL for the current entity and architecture names for 
post-NGDBuild simulation. 

Compiling Verilog Libraries 

For some Verilog simulators, such as NC-Verilog and ModelSim, you 
may need to compile the Verilog libraries before you can use them for 
design simulations. A pre-compiled library methodology has the 
advantage of speeding up the simulation of your designs. You do not 
need to compile the libraries for Verilog-XL because it uses an 
interpretive compilation of the libraries. To simulate Xilinx designs, 
you need the following simulation libraries. 

• UniSim Library—The UniSim library is used for behavioral 
(RTL) simulation with instantiated components in the netlist, and 
for post-synthesis (pre-M1) simulation. The Verilog library has 
separate libraries for each device family: uni3000, UniSims 
(XC4000E/L/X, Spartan/XL, and Virtex), uni5200, uni9000. 

• LogiBLOX Library—The LogiBLOX library is used for designs 
containing LogiBLOX components, during pre-synthesis (RTL) 
and post-synthesis simulation. Verilog uses SimPrim libraries. 

• SimPrim Library—The SimPrim library is used for 
post-NGDBuild (gate level functional), post-Map (partial timing), and 
post-place-and-route (full timing) simulations. This library is 
architecture independent. 

Compiling Libraries for ModelSim 

For detailed instructions on compiling these simulation libraries, see 
the instructions in Xilinx Solution # 1923, which is available at 
http://www.xilinx.com/techdocs/1923.htm. 

After compiling the libraries, notice that ModelSim creates a file 
called modelsim.ini. View this file and notice that the upper portion 
defines the locations of the compiled libraries. When doing a 
simulation, you must provide the modelsim.ini file either by copying the 
file directly to the directory where the HDL files are to be compiled and 
the simulation is to be run, or by setting the MODELSIM environment 
variable to the location of your master .ini file. You must set this 
variable since the ModelSim installation does not initially declare the 
path for you. For UNIX, type the following. 

setenv MODELSIM /path/to/the/modelsim.ini 

Setting Verilog Global Set/Reset 

For Verilog simulation, all behaviorally described (inferred) and 
instantiated registers should have a common signal which 
asynchronously sets or resets the register. You must toggle the global 
set/reset signal (GSR for XC4000E/L/X, Spartan/XL, and Virtex designs, 
or GR for XC5200, XC3000A/L, or XC3100A/L designs). Toggling the 
global set/reset emulates the Power-On-Reset of the FPGA. If you do 
not do this, the flip-flops and latches in your simulation enter an 
unknown state. 

The GSR signal in XC4000E/L/X, Spartan/XL, and Virtex devices, 
and the GR signal in XC5200 devices are active High. The GR signal 
in XC3000A/L and XC3100A/L devices is active Low. 

The global set/reset net is present in your implemented design even 
if you do not instantiate the STARTUP block in your design. The 
function of STARTUP is to give you the option to control the global 
reset net from an external pin. 

If you want to set the global set/reset pulse width so that it reflects 
the actual amount of time it takes for the chip to go through the reset 
process when power is supplied to it, refer to The Programmable Logic 
Data Book for the device you are simulating. The duration of the pulse 
is specified as TPOR (Power-On-Reset). 
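
A minimal sketch of how the pulse width might be kept in one place in a test fixture. The module name test_por_sketch and the parameter TPOR_NS are hypothetical, and the value 100 is only a placeholder, not a data book figure.

```verilog
`timescale 1 ns / 1 ps

// Sketch only: TPOR_NS is a hypothetical placeholder. Replace it with
// the power-on-reset time given in the data book for your device.
module test_por_sketch;
  parameter TPOR_NS = 100;

  reg GSR;

  initial begin
    GSR = 1'b1;            // assert the active-High global set/reset
    #(TPOR_NS) GSR = 1'b0; // release it after the chosen pulse width
  end
endmodule
```

Keeping the width in a parameter means a single edit changes the reset pulse for every stage of the flow that uses the fixture.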

The general procedure for specifying global set/reset or global reset 
during a pre-NGDBuild Verilog UniSims simulation involves 
defining the global reset signals with the $XILINX/verilog/src/glbl.v 
module. The VHDL UniSims library contains the ROC, 
ROCBUF, TOC, TOCBUF, and STARTBUF cells to assist in VITAL 
VHDL simulation of the global set/reset and tri-state signals. 
However, Verilog allows a global signal to be modeled as a wire in a 
global module, and, thus, does not contain these cells. 

Note: In the Xilinx software, the Verilog UniSims library is only used 
in RTL simulations of your designs. Simulation at other points in the 
flow uses the Verilog SimPrims Libraries. 
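
The idea of a global signal modeled as a wire in a global module can be sketched as follows. This is an illustration under stated assumptions, not the contents of glbl.v or of any UniSims model: glbl_sketch, reg_model_sketch, and test_sketch are hypothetical names used so the sketch does not clash with the real library files.

```verilog
`timescale 1 ns / 1 ps

// Hypothetical stand-in for a global module. It is never instantiated,
// so it is a top-level module and its wire can be reached by name.
module glbl_sketch;
  wire GSR;
endmodule

// Hypothetical register model. It has no reset port; it reads the
// global wire through a hierarchical reference instead.
module reg_model_sketch (Q, D, C);
  output Q;
  input D, C;
  reg Q;

  always @(posedge glbl_sketch.GSR or posedge C)
    if (glbl_sketch.GSR)
      Q <= 1'b0;
    else
      Q <= D;
endmodule

// Test fixture that drives the global wire from a local reg.
module test_sketch;
  reg GSR, D, C;
  wire Q;

  assign glbl_sketch.GSR = GSR;

  reg_model_sketch u0 (.Q(Q), .D(D), .C(C));

  initial begin
    C = 1'b0; D = 1'b1; GSR = 1'b1; // hold the register in reset
    #100 GSR = 1'b0;                // release the global reset
    #10 C = 1'b1;                   // first clock edge loads D
    #10 $finish;
  end
endmodule
```

Because the register reaches the reset through a hierarchical name, no reset port has to be threaded through the design hierarchy, which is why the Verilog library does not need ROC-style cells.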

Defining GSR in a Test Bench 

For pre-NGDBuild UniSims functional simulation, you must set the 
value of the appropriate Verilog global signals (glbl.GSR or glbl.GR) 
to the name of the GSR or GR net, qualified by the appropriate scope 
identifiers. 

The scope identifiers are a combination of the test module scope and 
the design instance scope. The scope qualifiers are required because 
the scope information is needed when the glbl.GSR and glbl.GR wires 
are interpreted by the Verilog UniSims simulation models to emulate 
a global reset signal. 

For post-NGDBuild and post-route timing simulation, the testfixture 
template (.tv file) produced by running NGD2VER with the -tf 
option contains most of the code previously described for defining 
and toggling GSR or GR. 

Use the following steps to define the global set/reset signals in a 
testfixture for your design. 

Note: In the following steps, testfixture_name refers to the test fixture 
module name and instance_name refers to the designated instance 
name for the instantiated design netlist within the test bench. 

1. For Verilog simulation without a STARTUP block in the design, 
Xilinx recommends naming the global set/reset net 
testfixture_name.instance_name.GSR or 
testfixture_name.instance_name.GR (Verilog is case-sensitive), and 
the signal should be declared as a Verilog reg data-type. 

2. For Verilog simulation with a STARTUP block in the design, the 
GSR/GR pin is connected to an external input port, and 
glbl.GSR/glbl.GR is defined within the STARTUP block to make 
the connection between the user logic and the global GSR/GR net 
embedded in the Unified models. For post-NGDBuild functional 
simulation, post-Map timing simulation, and post-route timing 
simulation, glbl.GSR/glbl.GR is defined in the Verilog netlist that 
is created by NGD2VER. 

The signal you toggle at the beginning of the simulation is the 
port or signal in your design that is used to control global 
set/reset. This is usually an external input port in the Verilog netlist, 
but it may also be a wire if global reset is controlled by logic 
internal to your design. 

3. When invoking Verilog-XL or ModelSim to run the simulation, 
compile the Verilog source files in any order since Verilog is 
compile order independent. However, Xilinx recommends that 
you specify the test fixture file before the Verilog netlist of your 
design, as in the following examples. 

• Cadence Verilog-XL 

For RTL simulation, enter the following. 

verilog -y $XILINX/verilog/src/unisims 
design.stim design.v $XILINX/verilog/src/glbl.v 

The path specified with the -y switch points the simulator to 
the UniSims models and is only necessary if Xilinx primitives 
are instantiated in your code. When targeting a device family 
other than the XC4000E/L/X, Spartan/XL, or Virtex families, 
change the unisims reference in the path to the targeted 
device family. 

For post-implementation simulation, enter the following. 

verilog design.stim time_sim.v $XILINX/verilog/src/glbl.v 

In this example, the same test fixture file is declared first 
followed by the simulation netlist created by the Xilinx tools. 
The name of the Xilinx simulation netlist may change 
depending on how the file was created. It is also assumed 
that the -ul switch was specified during NGD2VER to 
specify the location of the SimPrims libraries using the 
`uselib directive. 

• MTI ModelSim 

For RTL simulation, enter the following. 

vlog design.stim design.v $XILINX/verilog/src/glbl.v 

vsim -L unisims testfixture_name glbl 

This example targets the XC4000E/L/X, Spartan/XL, or 
Virtex families and assumes the UniSims libraries are properly 
compiled and named unisims. For more information on 
the compilation of the ModelSim libraries, refer to 
http://www.xilinx.com/techdocs/1923.htm 

For post-implementation simulation, enter the following. 

vlog design.stim time_sim.v $XILINX/verilog/src/glbl.v 

vsim -L simprims testfixture_name glbl 

This example is based on targeting the SimPrims libraries, 
which have been properly compiled and named simprims. 
Also, the name of the simulation netlist may change 
depending on how the file is created. 

Note: Xilinx recommends giving the name test to the main module in 
the test fixture file. This name is consistent with the name of the test 
fixture module that is written later in the design flow by NGD2VER 
during post-NGDBuild, post-MAP, or post-route simulation. If this 
naming consistency is maintained, you can use the same test fixture 
file for simulation at all stages of the design flow with minimal 
modification. 

Designs without a STARTUP Block 

If you do not have a STARTUP block in your design, you should add 
the following to the test fixture module. 

• XC4000E/L/X, Spartan/XL, or Virtex devices, 
reg GSR; 

assign glbl.GSR = GSR; 

assign testfixture_name.instance_name.GSR = GSR; // Only 
for RTL modeling of GSR 

• XC5200, XC3000A/L, and XC3100A/L devices, 
reg GR; 

assign glbl.GR = GR; 

assign testfixture_name.instance_name.GR = GR; // Only 
for RTL modeling of GR 

For post-NGDBuild functional simulation, post-Map timing 
simulation, and post-route timing simulation, you must omit the assign 
statement for the global reset signal. This is because the net 
connections exist in the post-NGDBuild design, and retaining the assign 
definition causes a possible conflict with these connections. 
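
One way to keep a single fixture for both cases is to guard the RTL-only assign statement with conditional compilation. The sketch below assumes that the statement to omit is the one marked "Only for RTL modeling"; the macro name RTL_SIM and the stand-in modules glbl_sketch and design_sketch are hypothetical and exist only so the sketch compiles without the Xilinx libraries.

```verilog
`timescale 1 ns / 1 ps

// Hypothetical stand-in for the global module.
module glbl_sketch;
  wire GSR;
endmodule

// Hypothetical stand-in for the design: GSR is a floating wire
// that is not in the port list, as in the RTL description.
module design_sketch;
  wire GSR;
endmodule

module test;
  reg GSR;

  design_sketch uut ();

  assign glbl_sketch.GSR = GSR;

`ifdef RTL_SIM
  // Compiled only when RTL_SIM is defined (RTL functional simulation).
  // Leave RTL_SIM undefined for post-NGDBuild netlists so this
  // connection is omitted.
  assign test.uut.GSR = GSR;
`endif

  initial begin
    GSR = 1'b1;      // assert global set/reset
    #100 GSR = 1'b0; // release after an arbitrary 100 ns
  end
endmodule
```

Many Verilog simulators accept a +define+RTL_SIM argument at compile time to define the macro; check your simulator's documentation for the exact form.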

Note: The terms "test bench" and "test fixture" are used 
synonymously throughout this manual. 

Example 1: XC4000E/L/X, Spartan/XL, or Virtex RTL 
Functional Simulation (No STARTUP/ 
STARTUP_VIRTEX Block) 

The following design shows how to drive the GSR signal in a 
testfixture file at the beginning of a pre-NGDBuild Unified Library 
functional simulation. 

You should reference the global set/reset net as GSR in 
XC4000E/L/X, Spartan/XL, or Virtex designs without a 
STARTUP/STARTUP_VIRTEX block. The Verilog module defining the 
global net must be referenced as glbl.GSR because this is how it is 
modeled in the Verilog UniSims library. 

In the design code, declare GSR as a Verilog wire; however, it is not 
specified in the port list for the module. Describe GSR to reset or set 
every inferred register or latch in your design. GSR does not need to 
be connected to any instantiated registers or latches, as shown in the 
following example. 

module my_counter (CLK, D, Q, COUT); 
input CLK, D; 
output Q; 
output [3:0] COUT; 

wire GSR; 
reg [3:0] COUT; 

always @(posedge GSR or posedge CLK) 
begin 
  if (GSR == 1'b1) 
    COUT = 4'h0; 
  else 
    COUT = COUT + 1'b1; 
end 

// FDCE instantiation 
// GSR is modeled as a wire within a global module. So, 
// CLR does not need to be connected to GSR and the flop 
// will still be reset with GSR. 

FDCE U0 (.Q (Q), .D (D), .C (CLK), .CE (1'b1), .CLR (1'b0)); 

endmodule 


Since GSR is declared as a floating wire and is not in the port list, the 
synthesis tool optimizes the GSR signal out of the design. GSR is 
replaced later by the implementation software for all 
post-implementation simulation netlists. 

In the test fixture file, set GSR to test.uut.GSR (the name of the global 
set/reset signal, qualified by the name of the design instantiation 
instance name and the test fixture instance name). Since there is no 
STARTUP block, a connection to GSR is made in the testfixture via an 
assign statement. 

`timescale 1 ns / 1 ps 
module test; 
reg CLK, D; 
wire Q; 
wire [3:0] COUT; 

reg GSR; 

assign glbl.GSR = GSR; 
assign test.uut.GSR = GSR; 

my_counter uut (.CLK (CLK), .D (D), .Q (Q), .COUT (COUT)); 

initial begin 
  $timeformat(-9, 1, "ns", 12); 
  // Column headings, read top to bottom: Time CLK GSR D Q COUT 
  $display("\t T C G D Q C"); 
  $display("\t i L S     O"); 
  $display("\t m K R     U"); 
  $display("\t e         T"); 
  $monitor("%t %b %b %b %b %h", $time, CLK, GSR, D, Q, COUT); 
end 

initial begin 
  CLK = 0; 
  forever #25 CLK = ~CLK; 
end 

initial begin 
  #0 {GSR, D} = 2'b11; 
  #100 {GSR, D} = 2'b10; 
  #100 {GSR, D} = 2'b00; 
  #100 {GSR, D} = 2'b01; 
  #100 $finish; 
end 

endmodule 


In this example, the active high GSR signal in the XC4000 family 
device is activated by driving it high. 100 ns later, it is deactivated by 
driving it low. (100 ns is an arbitrarily chosen value.) 

You can use the same test fixture for simulating at other stages in the 
design flow if this methodology is used. 

Example 2: XC5200 RTL Functional Simulation (No 
STARTUP Block) 

For pre-NGDBuild functional simulation, the active High GR net in 
XC5200 devices should be simulated in the same manner as GSR for 
XC4000E/L/X, Spartan/XL, or Virtex. 

In the design code, declare GR as a Verilog wire, however, it is not 
specified in the port list for the module. Describe GR to reset or set 
every inferred register or latch in your design. GR does not need to be 
connected to any instantiated registers or latches, as shown in the 
following example. 

module my_counter (CLK, D, Q, COUT); 
input CLK, D; 
output Q; 
output [3:0] COUT; 

wire GR; 
reg [3:0] COUT; 

always @(posedge GR or posedge CLK) 
begin 
  if (GR == 1'b1) 
    COUT = 4'h0; 
  else 
    COUT = COUT + 1'b1; 
end 

// FDCE instantiation 
// GR is modeled as a wire within a global module. So, 
// CLR does not need to be connected to GR and the flop 
// will still be reset with GR. 

FDCE U0 (.Q (Q), .D (D), .C (CLK), .CE (1'b1), .CLR (1'b0)); 

endmodule 


Since GR is declared as a floating wire and is not in the port list, the 
synthesis tool optimizes the GR signal out of the design. GR is 
replaced later by the implementation software for all 
post-implementation simulation netlists. 

In the test fixture file, set GR to test.uut.GR (the name of the global 
set/reset signal, qualified by the name of the design instantiation 
instance name and the test fixture instance name). Since there is no 
STARTUP block, a connection to GR is made in the testfixture via an 
assign statement. 

`timescale 1 ns / 1 ps 
module test; 
reg GR; 

assign glbl.GR = GR; 
assign test.uut.GR = GR; 

initial begin 
  GR = 1; // if you wish to reset/set the device 
  #100 GR = 0; // deactivate GR 
end 

In this example, the active high GR signal in the XC5200 family 
device is activated by driving it high. 100 ns later, it is deactivated by 
driving it low. (100 ns is an arbitrarily chosen value.) 

You can use the same test fixture for simulating at other stages in the 
design flow if this methodology is used. 

Example 3: XC3000A/L or XC3100A/L RTL Functional 
Simulation (No STARTUP Block) 

For pre-NGDBuild functional simulation, asserting global reset in 
XC3000A/L or XC3100A/L designs is almost identical to the 
procedure for asserting global reset in XC5200 designs, except that GR is 
active Low. 

Note: The STARTUP block is not supported on XC3000A/L or 
XC3100A/L devices. 

In the design code, declare GR as a Verilog wire, however, it is not 
specified in the port list for the module. Describe GR to reset or set 
every inferred register or latch in your design. GR does not need to be 
connected to any instantiated registers or latches, as shown in the 
following example. 

module my_counter (CLK, D, Q, COUT); 
input CLK, D; 
output Q; 
output [3:0] COUT; 

wire GR; 
reg [3:0] COUT; 

always @(negedge GR or posedge CLK) 
begin 
  if (GR == 1'b0) 
    COUT = 4'h0; 
  else 
    COUT = COUT + 1'b1; 
end 

// FDCE instantiation 
// GR is modeled as a wire within a global module. So, 
// CLR does not need to be connected to GR and the flop 
// will still be reset with GR. 

FDCE U0 (.Q (Q), .D (D), .C (CLK), .CE (1'b1), .CLR (1'b0)); 

endmodule 


Since GR is declared as a floating wire and is not in the port list, the 
synthesis tool optimizes the GR signal out of the design. Although 
this is correct in the hardware, it is actually an implicit connection, 
and not listed in the netlist (XNF or EDIF). GR is replaced later by the 
implementation software for all post-implementation simulation 
netlists. 

In the test fixture file, set GR to test.uut.GR (the name of the global 
set/reset signal, qualified by the name of the design instantiation 
instance name and the test fixture instance name). Since there is no 
STARTUP block, a connection to GR is made in the testfixture via an 
assign statement. 

`timescale 1 ns / 1 ps 
module test; 
reg GR; 

assign glbl.GR = GR; 
assign test.uut.GR = GR; 

initial begin 
  GR = 0; // if you wish to reset/set the device 
  #100 GR = 1; // deactivate GR 
end 

In this example, the active Low GR signal in the XC3000A/L and 
XC3100A/L family device is activated by driving it low. 100 ns later, 
it is deactivated by driving it high. (100 ns is an arbitrarily chosen 
value.) 

The Global Reset (GR) signal in the XC3000A/L and XC3100A/L 
architecture is modeled differently in functional simulation netlists 
and SimPrims library-based netlists generated by NGD2VER. In the 
Verilog Unified Library, GR is modeled as a wire within a global 
module, while in a SimPrims-based netlist, it is always modeled as an 
external port. As a result, you cannot use the same test bench file for 
both Unified library simulation and SimPrims-based simulation. 

Designs with a STARTUP Block 

If you do have a STARTUP block in your design, the signal you toggle 
is the external input port that controls the global reset pin of the 
STARTUP block. You should add the following to the test fixture 
module for RTL modeling of the global reset pin. 

Note: The terms "test bench" and "test fixture" are used 
synonymously throughout this manual. 

• XC4000E/L/X, Spartan/XL, and Virtex devices, 
reg port_connected_to_GSR_pin; 

• XC5200 devices, 
reg port_connected_to_GR_pin; 

For post-NGDBuild functional simulation, post-map timing 
simulation, and post-route timing simulation, you must omit the assign 
statement for the global reset signal. This is because the net 
connections exist in the post-NGDBuild design, and retaining the assign 
definition causes a possible conflict with these connections. 

By default for XC4000E/L/X, XC5200, Spartan/XL, and Virtex 
devices, the GSR/GR pin is active High. To change the polarity of 
these signals in your Verilog code, instantiate or infer an inverter to 
the net that sources the GSR/GR pin of the STARTUP block. 

Figure 5-7 Inverted GSR 

The inversion is absorbed inside the STARTUP block; a function 
generator is not used to generate the inverter. 

In the following Verilog code, GSR is listed as a top-level port. 

module my_counter (MYGSR, CLK, D, Q, COUT); 
input MYGSR, CLK, D; 
output Q; 
output [3:0] COUT; 

reg [3:0] COUT; 

wire INV_GSR; 
assign INV_GSR = !MYGSR; // Inverted GSR 

// Modeling inverted GSR with RTL code 
always @(posedge INV_GSR or posedge CLK) 
begin 
  if (INV_GSR == 1'b1) 
    COUT = 4'h0; 
  else 
    COUT = COUT + 1'b1; 
end 

// FDCE instantiation 
// GSR is modeled as a wire within a global module. So, 
// CLR does not need to be connected to GSR and the flop 
// will still be reset with GSR. 

FDCE U0 (.Q (Q), .D (D), .C (CLK), .CE (1'b1), .CLR (1'b0)); 
STARTUP U1 (.GSR (INV_GSR), .GTS (1'b0), .CLK (1'b0)); 

endmodule 
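
With the inverter in place, the external reset port becomes active Low, so a test fixture would drive it low to reset and high to release. The following is a minimal sketch of that stimulus only. The module name test_inv_gsr_sketch is hypothetical, the design is not instantiated, and the local INV_GSR wire merely mirrors the inverter for observation.

```verilog
`timescale 1 ns / 1 ps

// Sketch only: stimulus for an active-Low external reset port.
module test_inv_gsr_sketch;
  reg MYGSR;

  // Mirrors the inverter described in the design, for observation only.
  wire INV_GSR = !MYGSR;

  initial begin
    MYGSR = 1'b0;      // external pin Low: internal GSR is asserted
    #100 MYGSR = 1'b1; // external pin High: internal GSR is released
  end
endmodule
```

The 100 ns delay is arbitrary, as in the other examples in this section.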


Example 1: XC4000E/L/X and Spartan/XL Simulation 
with STARTUP, or Virtex with STARTUP_VIRTEX 

In the following figure, MYGSR is an external user signal that 
controls GSR. 

Figure 5-8 Verilog User-Controlled GSR 

In the following Verilog code, GSR is listed as a top-level port. 
Synthesis sees a connection of GSR to the STARTUP as well as to the 
behaviorally described counter. Although this is correct in the 
hardware, it is actually an implicit connection, and GSR is only listed as a 
connection to the STARTUP in the netlist (XNF and EDIF). 

module my_counter (MYGSR, CLK, D, Q, COUT); 
input MYGSR, CLK, D; 
output Q; 
output [3:0] COUT; 

reg [3:0] COUT; 

always @(posedge MYGSR or posedge CLK) 
begin 
  if (MYGSR == 1'b1) 
    COUT = 4'h0; 
  else 
    COUT = COUT + 1'b1; 
end 

// FDCE instantiation 
// GSR is modeled as a wire within a global module. So, 
// CLR does not need to be connected to GSR and the flop 
// will still be reset with GSR. 

FDCE U0 (.Q (Q), .D (D), .C (CLK), .CE (1'b1), .CLR (1'b0)); 

STARTUP U1 (.GSR (MYGSR), .GTS (1'b0), .CLK (1'b0)); 

endmodule 

The following is an example of controlling the global set/reset signal 
by driving the external MYGSR input port in a test fixture file at the 
beginning of an RTL or post-synthesis functional simulation when 
there is a STARTUP block in XC4000E/L/X and Spartan/XL designs, 
or the STARTUP_VIRTEX in Virtex. 

The global set/reset control signal should be toggled High, then Low 
in an initial block. 

`timescale 1 ns / 1 ps 
module test; 
reg GSR; 

initial begin 
  GSR = 1; // if you wish to reset/set the device 
  #100 GSR = 0; // deactivate GSR 
end 

In addition, a Verilog global signal called glbl.GSR is defined within 
the STARTUP/STARTUP_VIRTEX block to make the connection 
between the user logic and the global GSR net embedded in the 
Unified models. For post-NGDBuild functional simulation, post-Map 
timing simulation, and post-route timing simulation, glbl.GSR is 
defined in the Verilog netlist that is created by NGD2VER. 

Example 2: XC5200 Simulation with STARTUP 

For XC5200 designs with a STARTUP block, you should simulate the 
net controlling GR in the same manner as for the XC4000E/L/X, 
Spartan/XL, and Virtex. 

Substitute MYGR for MYGSR in Example 1 to obtain the testfixture 
fragment for simulating GR in a Verilog RTL or post-synthesis 
simulation. 

Figure 5-9 Verilog User-Controlled Inverted GR 

In addition, a Verilog global signal called glbl.GR is defined within 
the STARTUP block to make the connection between the user logic 
and the global GR net embedded in the Unified models. For 
post-NGDBuild functional simulation, post-map timing simulation, and 
post-route timing simulation, glbl.GR is defined in the Verilog netlist 
that is created by NGD2VER. 

Example 3: XC3000A/L and XC3100A/L Designs 

STARTUP is not supported or required in XC3000A/L and 
XC3100A/L designs. Follow the procedure for XC3000A/L and 
XC3100A/L designs without STARTUP blocks. 

Setting Verilog Global Tristate (XC4000, Spartan, 
and XC5200 Outputs Only) 

XC4000E/L/X, Spartan/XL, Virtex, and XC5200 devices also have a 
global control signal (GTS) that tristates all output pins. This allows 
you to isolate the actual device part during board level testing. You 
can also tristate the FPGA device outputs during board level 
simulation to assist in debugging simulation. In most cases, GTS is 
deactivated so that the outputs are active. 

Although the STARTUP/STARTUP_VIRTEX component also gives 
you the option of controlling the global tristate net from an external 
pin, it is usually used for controlling global reset. In this case, you can 
leave the GTS pin unconnected in the design entry phase, and it will 
float to its inactive state level. The global tristate net, GTS, is 
implemented in designs even if a STARTUP/STARTUP_VIRTEX block is 
not instantiated. You can deactivate GTS by driving it low in your test 
fixture file, or by connecting the GTS pin to GND in your input 
design. 

Defining GTS in a Test Bench 

For pre-NGDBuild UniSim functional simulation, you must set the 
value of the appropriate Verilog global signal, glbl.GTS, to the name 
of the GTS net, qualified by the appropriate scope identifiers. 

The scope identifiers are a combination of the test module scope and
the design instance scope. The scope qualifiers are required because
the scope information is needed when the glbl.GTS wire is
interpreted by the Verilog UniSim simulation models to emulate a global
tri-state signal.

For post-NGDBuild and post-route timing simulation, the test fixture
template (.tv file) produced by running NGD2VER with the -tf
option contains most of the code previously described for defining
and toggling GTS.

The general procedure for specifying GTS is similar to that used for
specifying the global set/reset signals, GSR and GR. You define the
global tristate signal with the Verilog global module, glbl.GTS. If you do
not want to specify GTS for simulation, you do not need to change
anything in your design or test fixture.

The GTS signal in XC4000E/L/X, Spartan/XL, Virtex, and XC5200
devices is active High. This global module is not used in timing
simulation when there is a STARTUP/STARTUP_VIRTEX block in your
design and the GTS pin is connected.

Designs without a STARTUP Block 

If you do not have a STARTUP block in your design, you should add 
the following to the test fixture module. 

reg GTS;
assign glbl.GTS = GTS;
assign testfixture_name.instance_name.GTS = GTS;
// Only for RTL simulation modeling of GTS

For post-NGDBuild functional simulation, post-map timing
simulation, and post-route timing simulation, you must omit the assign
statement for the global tri-state signal. This is because the net
connections exist in the post-NGDBuild design, and retaining the
assign definition causes a possible conflict with these connections.
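
For the RTL case, the fragment above can be fleshed out into a complete
fixture. The following is only a minimal sketch under stated
assumptions: the glbl stub stands in for the module that the UniSim
library normally supplies, and toy_design, uut, data, and pad are
hypothetical names chosen for illustration. The explicit tristate
assignment imitates what a UniSim output buffer model does with
glbl.GTS.

```verilog
`timescale 1ns / 1ns

// Stand-in (assumption) for the glbl module normally supplied by UniSim.
module glbl;
  wire GTS;
endmodule

// Hypothetical design: one output pad that floats while glbl.GTS is High,
// imitating the behavior of a UniSim output buffer model.
module toy_design (input wire data, output wire pad);
  assign pad = glbl.GTS ? 1'bz : data;
endmodule

module test;
  reg  GTS;
  reg  data;
  wire pad;

  // Drive the global tristate net from the test fixture (RTL simulation only).
  assign glbl.GTS = GTS;

  toy_design uut (.data(data), .pad(pad));

  initial begin
    data = 1'b1;
    GTS  = 1'b1;        // tristate the device outputs
    #100 GTS = 1'b0;    // deactivate GTS; outputs become active
    #100 $finish;
  end
endmodule
```

In this sketch, pad should read as high impedance for the first 100
time units and follow data afterward.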

Note: The terms "test bench" and "test fixture" are used
synonymously throughout this manual.

XC4000E/L/X, Spartan/XL, Virtex, and XC5200 RTL
Functional Simulation (No STARTUP Block)

You can drive the GTS signal in a test fixture file at the beginning of a
pre-NGDBuild RTL or post-synthesis functional simulation. The
global tristate net is named GTS in XC4000E/L/X, Spartan/XL,
Virtex, and XC5200 designs. The Verilog module defining the global
tri-state net must be referenced as glbl.GTS because this is how it is
modeled in the Verilog UniSim library.

Designs with a STARTUP Block

If you do have a STARTUP block in your design, the signal you toggle
at the beginning of simulation is the port or signal in your design that
is used to control global tristate. This is usually an external input port
in the Verilog netlist, but can be a wire if global tristate is controlled
by internal logic in your design. A Verilog global signal called
glbl.GTS is defined within the STARTUP block to make the
connection between the user logic and the global GTS net embedded in the
Unified models.

Example 1: XC4000E/L/X, Spartan/XL, Virtex, and
XC5200 Simulation (With STARTUP/
STARTUP_VIRTEX, GTS Pin Connected)

In the following figure, MYGTS is an external user signal that 
controls GTS. 


Figure 5-10 Verilog User-Controlled Inverted GTS

The following is an example of controlling the global tri-state signal
by driving the external MYGTS input port in a test fixture file at the
beginning of an RTL or post-synthesis functional simulation when
there is a STARTUP block in an XC4000E/L/X or Spartan/XL design,
or a STARTUP_VIRTEX block in a Virtex design. The global GTS net is
modeled in the UniSim simulation models for output buffers (OBUF,
OBUFT, and so on).

The global tri-state control signal should be toggled High, then Low 
in an initial block. 

module test;
reg MYGTS;

initial begin
MYGTS = 1; // if you wish to tristate the device;
#100 MYGTS = 0; // deactivate GTS
end

Example 2: XC4000E/L/X, Spartan/XL, Virtex, and
XC5200 Simulation (With STARTUP/
STARTUP_VIRTEX, GTS Pin not connected)

A Verilog global signal called glbl.GTS is defined within the
STARTUP/STARTUP_VIRTEX block to make the connection between
the user logic and the global GTS net embedded in the Unified
models. For post-NGDBuild functional simulation, post-map timing
simulation, and post-route timing simulation, glbl.GTS is defined in
the Verilog netlist that is created by NGD2VER.

module test;
reg GTS;

assign glbl.GTS = GTS;

initial begin
GTS = 1; // if you wish to tristate the device;
#100 GTS = 0; // deactivate GTS
end

Note: For post-route timing simulation, you can use the same test
bench.

Accelerate FPGA Macros with One-Hot
Approach


By Steven K. Knapp 

Xilinx Inc. 

2100 Logic Dr. 

San Jose, CA 95124

Reprinted with permission from Electronic Design, September 13,
1990. © Penton Publications.

State machines, one of the most commonly implemented functions
with programmable logic, are employed in various digital
applications, particularly controllers. However, the limited number of
flip-flops and the wide combinatorial logic of a PAL device favor state
machines that are based on a highly encoded state sequence. For
example, each state within a 16-state machine would be encoded
using four flip-flops as the binary values between 0000 and 1111.

A more flexible scheme, called one-hot encoding (OHE), employs
one flip-flop per state for building state machines. Although it can be
used with PAL-type programmable-logic devices (PLDs), OHE is
better suited for use with the fan-in limited and flip-flop-rich
architectures of the higher-gate-count field-programmable gate arrays
(FPGAs), such as offered by Xilinx, Actel, and others. This is because
OHE requires a larger number of flip-flops. It offers a simple and
easy-to-use method of generating performance-optimized
state-machine designs because there are few levels of logic between
flip-flops.

Figure A-1 A Typical State Machine Bubble

(In reference to the figure above) A Typical State Machine Bubble
diagram shows the operation of a seven-state state machine that
reacts to inputs A through E as well as previous-state conditions.



Figure A-2 Inverters 


(In reference to the figure above) Inverters are required at the D
input and the Q output of the state flip-flop to ensure that it powers
on in the proper state. Combinatorial logic decodes the operations
based on the input conditions and the state feedback signals. The
flip-flop will remain in State 1 as long as the conditional paths out
of the state are not valid.


A state machine implemented with a highly encoded state sequence
will generally have many wide-input logic functions to interpret the
inputs and decode the states. Furthermore, incorporating a highly
encoded state machine in an FPGA requires several levels of logic
between clock edges because multiple logic blocks will be needed for
decoding the states. A better way to implement state machines in
FPGAs is to match the state-machine architecture to the device
architecture.

Limiting Fan-In 

A good state-machine approach for FPGAs limits the amount of
fan-in into one logic block. While the one-hot method is best for most
FPGA applications, binary encoding is still more efficient in certain
cases, such as for small state machines. It's up to the designer to
evaluate all approaches before settling on one for a particular application.



Figure A-3 The Seven States 

(In reference to the figure above) Of the seven states, the
state-transition logic required for State 4 is the most complex, requiring
inputs from three other state outputs as well as four of the five
condition signals (A - D).

FPGAs are high-density programmable chips that contain a large
array of user-configurable logic blocks surrounded by
user-programmable interconnects. Generally, the logic blocks in an FPGA have a
limited number of inputs. The logic block in the Xilinx XC-3000
series, for instance, can implement any function of five or fewer inputs.
In contrast, a PAL macrocell is fed by each input to the chip and all of
the flip-flops. This difference in logic structure between PALs and
FPGAs is important for functions with many inputs: where a PAL
could implement a many-input logic function in one level of logic, an
FPGA might require multiple logic layers due to the limited number
of inputs.

The OHE scheme is named so because only one state flip-flop is
asserted, or "hot", at a time. Using the one-hot encoding method for 
FPGAs was originally conceived by High-Gate Design—a Saratoga, 
Calif.-based consulting firm specializing in FPGA designs. 

The OHE state machine's basic structure is simple: first assign an
individual flip-flop to each state, and then permit only one state to be
active at any time. A state machine with 16 states would require 16
flip-flops using the OHE approach; a highly encoded state machine
would need just four flip-flops. At first glance, OHE may seem
counter-intuitive. For designers accustomed to using PLDs, more
flip-flops typically indicate either using a larger PLD or even
multiple devices.
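
To make the two encodings concrete, the following minimal sketch
converts a 4-bit binary state code into the equivalent 16-bit one-hot
code. The module and port names (bin_to_onehot, bin_state, hot_state)
are hypothetical and serve only to illustrate the 4 versus 16 flip-flop
trade-off described above.

```verilog
// Decodes a 4-bit binary state code into the equivalent one-hot code.
module bin_to_onehot (
  input  wire [3:0]  bin_state,  // highly encoded: 4 bits cover 16 states
  output wire [15:0] hot_state   // one-hot: 16 bits, exactly one bit set
);
  // Shift a single 1 into the bit position selected by the binary code.
  assign hot_state = 16'b1 << bin_state;
endmodule
```

A binary-encoded machine needs this kind of decoding in front of its
next-state logic; a one-hot machine stores the decoded form directly.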

In an FPGA, however, OHE yields a state machine that generally
requires fewer resources and has higher performance than a
binary-encoded implementation. OHE has definite advantages for FPGA
designs because it exploits the strengths of the FPGA architecture. It
usually requires two or fewer levels of logic between clock edges,
which is fewer than binary encoding needs. That translates into faster
operation. Logic circuits are also simplified because OHE removes much
of the state-decoding logic: a one-hot-encoded state machine is
already fully decoded.

OHE requires only one input to decode a state, making the next-state 
logic simple and well-suited to the limited fan-in architecture of 
FPGAs. In addition, the resulting collection of flip-flops is similar to a
shift-register-like structure, which can be placed and routed
efficiently inside an FPGA device. The speed of an OHE state machine
remains fairly constant even as the number of states grows. In 
contrast, a highly encoded state machine's performance drops as the 
states grow because of the wider and deeper decoding logic that's 
required. 

Building the next-state logic for an OHE state machine is simple, lending
itself to a "cookbook" approach. At first glance, designers familiar
with PAL-type devices may be concerned by the number of potential
illegal states due to the sparse state encoding. This issue, to be
discussed later, can be solved easily.

A typical, simple state machine might contain seven distinct states
that can be described with the commonly used circle-and-arc bubble
diagrams (see the "A Typical State Machine Bubble" figure). The label
above the line in each "bubble" is the state's name. The labels below
the line are the outputs asserted while the state is active. In the
example, there are seven states labeled State 1-7. The "arcs" that feed
back into the same state are the default paths. These will be true only
if no other conditional paths are true.

Each conditional path is labeled with the appropriate logical
condition that must exist before moving to the next state. All of the logic
inputs are labeled as variables A through E. The outputs from the
state machine are called Single, Multi, and Contig. For this example,
State 1, which must be asserted at power-on, has a double-inverted
flip-flop structure (shaded region of the "Inverters" figure).

The state machine in the example was built twice, once using OHE
and again with the highly encoded approach employed in most PAL 
designs. A Xilinx XC3020-100 2000-gate FPGA was the target for both 
implementations. Though the OHE circuit required slightly more 
logic than the highly-encoded state machine, the one-hot state 
machine operated 17% faster (see the table). Intuitively, the one-hot 
method might seem to employ many more logic blocks than the 
highly encoded approach. But the highly encoded state machine 
needs more combinatorial logic to decode the encoded state values. 



Figure A-4 Only a Few Gates 

(In reference to the figure above) Only a few gates are required by
States 2 and 3 to form simple state-transition logic decoding. Just
two gates are needed by State 2 (top), while four simple gates are
used by State 3 (bottom).

The OHE approach produces a state machine with a shift-register
structure that almost always outperforms a highly encoded state
machine in FPGAs. The one-hot design had only two layers of logic
between flip-flops, while the highly encoded design had three. For
other applications, the results can be far more dramatic. In many 
cases, the one-hot method yields a state machine with one layer of 
logic between clock edges. With one layer of logic, a one-hot state 
machine can operate at 50 to 60 MHz. 



Figure A-5 Looking Nearly the Same 

(In reference to the figure above) Looking nearly the same as a
simple shift register, the logic for States 5, 6, and 7 is very simple.
This is because the OHE scheme eliminates almost all decoding 
logic that precedes each flip-flop. 

The initial or power-on condition in a state machine must be
examined carefully. At power-on, a state machine should always enter an
initial, known state. For the Xilinx FPGA family, all flip-flops are reset
at power-on automatically. To assert an initial state at power-on, the
output from the initial-state flip-flop is inverted. To maintain logical
consistency, the input to the flip-flop also is inverted.

All other states use a standard D-type flip-flop with an asynchronous
reset input. The purpose of the asynchronous reset input will be 
discussed later when illegal states are covered. 

Once the start-up conditions are set up, the next-state transition logic
can be configured. To do that, first examine an individual state. Then
count the number of conditional paths leading into the state and add
an extra path if the default condition is to remain in the same state.
Second, build an OR-gate with the number of inputs equal to the
number of conditional paths that were determined in the first step.
Third, for each input of the OR-gate, build an AND-gate of the
previous state and its conditional logic. Finally, if the default should
remain in the same state, build an AND-gate of the present state and
the inverse of all possible conditional paths leaving the present state.

To determine the number of conditional paths feeding State 1,
examine the state diagram: State 1 has one path from State 7
whenever the variable E is true. Another path is the default condition,
which stays in State 1. As a result, there are two conditional paths
feeding State 1. Next, build a 2-input OR-gate: one input for the
conditional path from State 7, the other for the default path to stay in
State 1 (shown as OR-1 in the "Inverters" figure).

The next step is to build the conditional logic feeding the OR-gate.
Each input into the OR-gate is the logical AND of the previous state
and its conditional logic feeding into State 1. State 7, for example,
feeds State 1 whenever E is true and is implemented using the gate
called AND-2 in the "Inverters" figure. The second input into the
OR-gate is the default transition that's to remain in State 1. In other
words, if the current state is State 1, and no conditional paths leaving
State 1 are valid, then the state machine should remain in State 1.
Note in the state diagram that two conditional paths are leaving State
1, in the "A Typical State Machine Bubble" figure.
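
The State 1 structure can be sketched in Verilog as follows. This is
only an illustration under stated assumptions, not the article's
schematic: the module and signal names are hypothetical, and because
the bar notation for complemented inputs is not legible in this text,
the polarities of the two exit conditions (to_state2 and to_state4)
are assumptions.

```verilog
module state1_cell (
  input  wire clk,
  input  wire a, b, c, e,
  input  wire state7,
  output wire state1
);
  // Exit conditions from State 1 (input polarities are assumptions).
  wire to_state2 = a & ~b & c;
  wire to_state4 = a & b & ~c;

  wire and2 = state7 & e;                         // AND-2: entry from State 7
  wire and3 = state1 & ~(to_state2 | to_state4);  // AND-3: default path
  wire or1  = and2 | and3;                        // OR-1

  // The flip-flop starts at 0, as after power-on reset. Inverting both
  // its D input and its Q output makes State 1 read as active at power-on.
  reg q = 1'b0;
  always @(posedge clk)
    q <= ~or1;

  assign state1 = ~q;
endmodule
```

With all inputs low, state1 stays high on every clock, which is the
default path described above.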

The first path is valid whenever (A'B'C) is true, which leads into
State 2. The second path is valid whenever (A'B'C) is true, leading
into State 4. To build the default logic, State 1 is ANDed with the
inverse of all the conditional paths leaving State 1. The logic to
perform this function is implemented in the gate labeled AND-3 and
the logic elements that feed into the inverting input of AND-3, in the
"Inverters" figure.



Figure A-6 S-R Flip-Flops

(In reference to the figure above) S-R Flip-Flops offer another
approach to decoding the Contig output. They can also save logic
blocks, especially when an output is asserted for a long sequence of
contiguous states.

State 4 is the most complex state in the state-machine example. 
However, creating the logic for its next-state control follows the same 
basic method as described earlier. To begin with, State 4 isn't the
initial state, so it uses a normal D-type flip-flop without the inverters.
It does, however, have an asynchronous reset input, three paths into
the state, and a default condition that stays in State 4. Therefore, a
four-input OR-gate feeds the flip-flop (OR-1 in the "The Seven States"
figure).

The first conditional path comes from State 3. Following the methods
established earlier, an AND of State 3 and the conditional logic,
which is A ORed with D, must be implemented (AND-2 and OR-3 in
the "The Seven States" figure). The next conditional path is from State
2, which requires an AND of State 2 and variable D (AND-4 in the
"The Seven States" figure). Lastly, the final conditional path leading
into State 4 is from State 1. Again, the State-1 output must be ANDed
with its conditional path logic, the logical product A'B'C (AND-5
and AND-6 in the "The Seven States" figure).

Now, all that must be done is to build the logic that remains in State 4
when none of the conditional paths away from State 4 are true. The
path leading away from State 4 is valid whenever the product
A'B'C is true. Consequently, State 4 must be ANDed with the
inverse of the product A'B'C. In other words, "keep loading the
flip-flop with a high until a valid transfer to the next state occurs." The
default path logic uses AND-7 and shares the output of AND-6.


One-Hot vs. Binary Encoding Methods

Method            Number of Logic Blocks   Worst-case performance
One-hot                                    40 MHz
Binary encoding   7.0                      34 MHz


Configuring the logic to handle the remaining states is very simple.
State 2, for example, has only one conditional path, which comes
from State 1 whenever the product A'B'C is true. However, the state
machine will immediately branch in one of two ways from State 2,
depending on the value of D. There's no default logic to remain in
State 2 (see the "Only a Few Gates" figure). State 3, like States 1 and 4,
has a default state, and combines the A, D, State 2, and State 3 feedback
to control the flip-flop's D input in the "Only a Few Gates" figure.

State 5 feeds State 6 unconditionally. Note that the state machine
waits until variable E is low in State 6 before proceeding to State 7.
Again, while in State 7, the state machine waits for variable E to
return to true before moving to State 1 (see the "Looking Nearly the
Same" figure).
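
Putting the seven states together, the whole next-state network can be
sketched behaviorally in Verilog. This is a hedged illustration, not
the article's gate-level design: the module and signal names are
hypothetical, and because the bar notation for complemented inputs is
not legible in this text, the polarities of the A, B, C products and
of the D branch out of State 2 are assumptions.

```verilog
module onehot_fsm (
  input  wire clk,
  input  wire a, b, c, d, e,
  output wire [7:1] state       // state[n] is high while State n is active
);
  reg       q1 = 1'b0;          // State 1 flip-flop, used with inverted D and Q
  reg [7:2] q  = 6'b0;          // States 2 to 7, plain D-type flip-flops

  assign state = {q, ~q1};

  // Condition products (input polarities are assumptions).
  wire s1_to_s2 = a & ~b & c;
  wire s1_to_s4 = a & b & ~c;
  wire s4_to_s5 = a & b & ~c;   // same product as s1_to_s4, so it can be shared

  always @(posedge clk) begin
    // State 1: entry from State 7 on E, or default path.
    q1   <= ~((state[7] & e) | (state[1] & ~(s1_to_s2 | s1_to_s4)));
    // State 2: single entry, no default path.
    q[2] <= state[1] & s1_to_s2;
    // State 3: entry from State 2, or default path.
    q[3] <= (state[2] & ~d) | (state[3] & ~(a | d));
    // State 4: three entries plus the default path.
    q[4] <= (state[3] & (a | d)) | (state[2] & d) | (state[1] & s1_to_s4)
          | (state[4] & ~s4_to_s5);
    q[5] <= state[4] & s4_to_s5;
    // State 6: unconditional entry from State 5; waits while E is high.
    q[6] <= state[5] | (state[6] & e);
    // State 7: entered when E goes low; waits until E is high again.
    q[7] <= (state[6] & ~e) | (state[7] & ~e);
  end
endmodule
```

Each next-state equation depends on at most a few state bits and inputs,
which is the limited fan-in property the article relies on.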

Output Definitions 

After defining all of the state transition logic, the next step is to define 
the output logic. The three output signals—Single, Multi, and 
Contig—each fall into one of three primary output types: 

1. Outputs asserted during one state, which is the simplest case.
The output signal Single, asserted only during State 6, is an
example.

2. Outputs asserted during multiple contiguous states. This appears
simple at first glance, but a few techniques exist that reduce logic
complexity. One example is Contig. It's asserted from State 3 to
State 7, even though there's a branch at State 2.

3. Outputs asserted during multiple, non-contiguous states. The 
best solution is usually brute-force decoding of the active states. 
One such example is Multi, which is asserted during State 2 and 
State 4. 

OHE makes defining outputs easy. In many cases, the state flip-flop is 
the output. For example, the Single output also is the flip-flop output 
for State 6; no additional logic is required. The Contig output is 
asserted throughout States 3 through 7. Though the paths between 
these states may vary, the state machine will always traverse from 
State 2 to a point where Contig is active in either State 3 or State 4. 
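
The three output types can be sketched as simple decodes of the one-hot
state vector. The module and port names below are hypothetical; the
state assignments (Single in State 6, Multi in States 2 and 4, Contig
in States 3 through 7) come from the list above, and Contig is shown
here as a plain OR of its five states.

```verilog
module fsm_outputs (
  input  wire [7:1] state,   // one-hot state vector, state[n] = State n
  output wire single,
  output wire multi,
  output wire contig
);
  assign single = state[6];             // one state: the flip-flop is the output
  assign multi  = state[2] | state[4];  // non-contiguous states: decode each one
  assign contig = |state[7:3];          // contiguous States 3 to 7: 5-input OR
endmodule
```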

There are many ways to implement the output logic for the Contig
output. The easiest method is to decode States 3, 4, 5, 6, and 7 with a
5-input OR gate. Any time the state machine is in one of these states,
Contig will be active. Simple decoding works best for this state
machine example. Decoding five states won't exceed the input
capability of the FPGA logic block.

Additional Logic

However, when an output must be asserted over a longer sequence of 
states (six or more), additional layers of decoding logic would be 
required. These additional logic layers reduce the state machine's 
performance. 

Employing S-R flip-flops gives designers another option when
decoding outputs over multiple, contiguous states. Though the basic
FPGA architecture may not have physical S-R flip-flops, most
macrocell libraries contain one built from logic and D-type flip-flops. Using
S-R flip-flops is especially valuable when an output is active for six or
more contiguous states.

The S-R flip-flop is set when entering the contiguous states, and reset
when leaving. It usually requires extra logic to look at the state just 
prior to the beginning and ending state. This approach is handy 
when an output covers multiple, non-contiguous states, assuming 
there are enough logic savings to justify its use. 

In the example, States 3 through 7 can be considered contiguous.
Contig is set after leaving State 2 for either States 3 or 4, and is reset 
after leaving State 7 for State 1. There are no conditional jumps to 
states where Contig isn't asserted as it traverses from State 3 or 4 to 
State 7. Otherwise, these states would not be contiguous for the 
Contig output. 

The Contig output logic, built from an S-R flip-flop, will be set with
State 2 and reset when leaving State 7 (see the "S-R Flip-Flops" figure).
As an added benefit, the Contig output is synchronized to the master
clock. Obvious logic reduction techniques shouldn't be overlooked
either. For example, the Contig output is active in all states except for
States 1 and 2. Decoding the states where Contig isn't true, and then
asserting the inverse, is another way to specify Contig.
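
A minimal sketch of the S-R approach for Contig follows, with the S-R
function built from a D-type flip-flop and a little logic. The module
and signal names are hypothetical, and using "State 7 and E" as the
reset term is an assumption based on the earlier description that State
7 is left for State 1 when E is true.

```verilog
module contig_sr (
  input  wire clk,
  input  wire state2,
  input  wire state7,
  input  wire e,
  output wire contig
);
  wire set   = state2;       // State 2 always leads to State 3 or State 4
  wire reset = state7 & e;   // assumed exit condition from State 7

  // S-R behavior from a D flip-flop: set wins, otherwise hold unless reset.
  reg q = 1'b0;
  always @(posedge clk)
    q <= set | (q & ~reset);

  assign contig = q;
endmodule
```

Because the flip-flop is clocked, contig changes on the same edge that
the state machine enters State 3 or 4 and on the edge that it returns
to State 1.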

The Multi output is asserted during multiple, non-contiguous states,
exclusively during States 2 and 4. Though States 2 and 4 are
contiguous in some cases, the state machine may traverse from State 2 to
State 4 via State 3, where the Multi output is unasserted. Simple
decoding of the active states is generally best for non-contiguous
states. If the output is active during multiple, non-contiguous
states over long sequences, the S-R flip-flop approach described
earlier may be useful.

One common issue in state-machine construction deals with
preventing illegal states from corrupting system operation. Illegal
states exist in areas where the state machine's functionality is
undefined or invalid. For state machines implemented in PAL devices, the
state-machine compiler software usually generates logic to prevent or
to recover from illegal conditions.

In the OHE approach, an illegal condition will occur whenever two or 
more states are active simultaneously. By definition, the one-hot 
method makes it possible for the state machine to be in only one state 
at a time. The logic must either prevent multiple, simultaneous states 
or avoid the situation entirely. 
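
The illegal condition defined above, two or more state bits active at
once, can be detected with a small amount of logic. The following is a
hypothetical sketch (the module and port names are not from the
article) that uses the common bit trick of ANDing a vector with itself
minus one, which clears the lowest set bit.

```verilog
module illegal_detect (
  input  wire [7:1] state,   // one-hot state vector
  output wire       illegal  // high when two or more state bits are set
);
  // state & (state - 1) removes the lowest set bit; anything left over
  // means at least one more bit was set.
  assign illegal = (state & (state - 7'd1)) != 7'd0;
endmodule
```

Note that this sketch follows the article's definition only: an
all-zero vector is not flagged.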

Synchronizing all of the state-machine inputs to the master clock 
signal is one way to prevent illegal states. "Strange" transitions won't 
occur when an asynchronous input changes too closely to a clock 
edge. Though extra synchronization would be costly in PAL devices, 
the flip-flop-rich architecture of an FPGA is ideal. 
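
Input synchronization of this kind can be sketched as one register per
input. The module and port names are hypothetical. A single stage is
shown to match the description above; designs that are concerned about
metastability commonly add a second register stage.

```verilog
module input_sync (
  input  wire       clk,
  input  wire [4:0] async_in,   // raw inputs, for example A through E
  output reg  [4:0] sync_out    // versions that change only on clock edges
);
  always @(posedge clk)
    sync_out <= async_in;
endmodule
```

The state machine then uses sync_out instead of the raw inputs, so every
input it sees changes just after a clock edge.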

Even off-chip inputs can be synchronized in the available input flip- 
flops. And internal signals can be synchronized using the logic 
block's flip-flops (in the case of the Xilinx LCAs). The extra
synchronization logic is free, especially in the Xilinx FPGA family where every
block has an optional flip-flop in the logic path.

Resetting State Bits 

Resetting the state machine to a legal state, either periodically or
when an illegal state is detected, gives designers yet another choice.
The Reset Direct (RD) inputs to the flip-flops are useful in this case.
Because only one state bit should be set at any time, the output of a
state can reset other state bits. For example, State 4 can reset State 3.

If the state machine did fall into an illegal condition, eventually State 
4 would be asserted, clearing State 3. However, State 4 can't be used 
to reset State 5, otherwise the state machine won’t operate correctly. 
To be specific, it will never transfer to State 5; it will always be held 
reset by State 4. Likewise, State 3 can reset State 2, State 5 can reset
State 4, etc.—as long as one state doesn't reset a state that it feeds. 
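
The reset scheme can be sketched with a D-type flip-flop that has an
asynchronous Reset Direct input. The module below is a hypothetical
illustration; the instance shown in the usage note, including the
next_state3 signal name, is likewise an assumption.

```verilog
// D-type state flip-flop with an asynchronous, active-High Reset Direct input.
module state_ff (
  input  wire clk,
  input  wire d,
  input  wire rd,
  output reg  q
);
  initial q = 1'b0;           // models the automatic reset at power-on

  always @(posedge clk or posedge rd)
    if (rd) q <= 1'b0;        // cleared as soon as the resetting state is active
    else    q <= d;
endmodule
```

State 3 could then be built as
state_ff ff3 (.clk(clk), .d(next_state3), .rd(state4), .q(state3));
so that an active State 4 clears State 3.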

This technique guarantees a periodic, valid condition for the state
machine with little additional overhead. Notice, however, that State 1
is never reset. If State 1 were "reset", it would force the output of
State 1 high, causing two states to be active simultaneously (which,
by definition, is illegal).

Report Files


This appendix includes report files from various synthesis vendors. 
To reduce the size of this appendix, some of the files are truncated 
(indicated by a series of dots) where information is repeated. This 
appendix contains the following sections. 

• "Synplicity" 

• "Exemplar Logic" 

Synplicity 

Synplicity report files include the following.

• Synthesis information on generated state machines and inserted
items, such as clock buffers

• Predicted timing performance, including maximum frequency of
worst case paths

• Area usage information for IOBs, carry logic, registers, FMAPs,
HMAPs, and CLBs

Note: The report file in this section is for a design compiled with the 
VHDL compiler. A report file for a design compiled with the Verilog 
compiler is essentially the same. 


$ Start of Compile
#Mon Jan 12 07:59:16 1998

Synplify VHDL Compiler, version 3.0b, built Dec 17 1997
Copyright (C) 1994-1997, Synplicity Inc. All Rights Reserved

VHDL syntax check successful!
Compiler output is up to date. No re-compile necessary
Synthesizing work.acc_chip.schematic

@N:"c:\customer\atm\lcl_int1.vhd":73:6:73:7|Trying to extract state
machine for register wr_state_o

Extracted state machine for register wr_state_o

State machine has 4 reachable states with original encodings of:
11
10
01
00

@N:"c:\customer\atm\lcl_int1.vhd":73:6:73:7|Trying to extract state
machine for register rd_state_o

Extracted state machine for register rd_state_o

State machine has 5 reachable states with original encodings of:
00001
00010
00100
01000
10000

Post processing for work.acc_chip.schematic
@END

Process took 0.371 seconds realtime, 0.371 seconds cputime
Synplify Xilinx Technology Mapper, version 3.0b, built Dec 21 1997
Copyright (C) 1994-1997, Synplicity Inc. All Rights Reserved
Setting fanout limit to 100

@N:"c:\customer\atm\token1.vhd":103:8:103:9|Found counter in
view:work.TOKEN(vhdl_rtl) inst token_cntr[5:0]

@N:"c:\customer\atm\rx_agen1.vhd":61:6:61:7|Found counter in
view:work.RX_ADD_GEN(vhdl_rtl) inst rx_20_add[19:0]

@N:"c:\customer\atm\rx_agen1.vhd":61:6:61:7|Found counter in
view:work.RX_ADD_GEN(vhdl_rtl) inst rx_10_add[19:0]

@N:"c:\customer\atm\slw_clk1.vhd":21:10:21:11|Found counter in
view:work.SLOW_CLOCKS(vhdl_rtl) inst slow_counter[16:0]

@N:"c:\customer\atm\mngmnt1.vhd":998:4:999:5|Found counter in
view:work.MNGMNT(vhdl_rtl) inst tx_frame_cntr[11:0]

@N:"c:\customer\atm\mngmnt1.vhd":931:4:931:5|Found counter in
view:work.MNGMNT(vhdl_rtl) inst rx_frame_cntr[11:0]

Clock Buffers: 

Inserting Clock buffer for port CLK_20M_3MP_I, TNM=CLK_20M_3MP_I
Inserting Clock buffer for port CPU_WR_M_I, TNM=CPU_WR_M_I
Inserting Clock buffer for port CLK_40M_P_I, TNM=CLK_40M_P_I
Inserting Clock buffer for port CLK_40M_S2_I, TNM=CLK_40M_S2_I

Net buffering Report:

No nets needed buffering.


Timing reports 

Delay - This is the delay from a start point
such as a register or primary input.

Slack - If this value is negative then it indicates
the size of the timing violation.

FO - This is the estimated fanout or loading
used in calculating net delays


Requested default timing: 

Frequency=15.0 MHz, Period=66.7 ns

Estimated result:

Frequency=16.7 MHz, Period=60.0 ns

Slack on longest path: 6.7 ns 

Timing Information for Longest Paths: 

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base0_reg20[12], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_0_base0_reg20_4[12] Delay=60.4,
Slack=6.7

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base0_reg20[18], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_0_base0_reg20_4[18] Delay=59.8,
Slack=7.3

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base0_reg20[15], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_0_base0_reg20_4[15] Delay=59.0,
Slack=8.1

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base3_reg20[15], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.N_i956 Delay=58.6, Slack=8.5

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base0_reg20[11], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_0_base0_reg20_4[11] Delay=58.4,
Slack=8.7

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base3_reg20[13], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_3_base3_reg20_4[13] Delay=58.3,
Slack=8.8

Instance: Z_40ACCCHIPZ_32TXD2PZ_41.base1_reg20[13], cell DFFRE
D I - Z_40ACCCHIPZ_32TXD2PZ_41.sync_tx20_1_base1_reg20_4[13] Delay=58.3,
Slack=8.8

Resource Usage Report


Mapping to part: 40125xvbg580-1

I/O primitives:
OBUF            98 uses
IBUF            43 uses
OUTFF INIT=R    14 uses
INFF INIT=R     10 uses
BUFG             4 uses

Carry primitives used for arithmetic functions:
INC-FG-1        11 uses
INC-FG-CI      121 uses
FORCE-0         14 uses
ADD-FG-CI      183 uses
EXAMINE-CI      55 uses
FORCE-1         30 uses
ADD-G-F1         4 uses
SUB-FG-CI      180 uses
DEC-FG-0         3 uses
DEC-FG-CI       12 uses

Register bits not including I/Os: 1258

Logic Mapping Summary:
FMAPs: 3059 of 9248 (34%)
HMAPs: 847 of 4824 (14%)
Total packed CLBs: 1530 of 4624 (34%)
(Packed CLBs is determined by the larger of three quantities:
Registers / 2, HMAPs, or FMAPs / 2.)

Mapper successful!

Process took 189.984 seconds realtime, 189.984 seconds cputime

Exemplar Logic 

The area section of the Exemplar Logic report includes area
utilization information for FMAPs, HMAPs, CLBs, IO buffers, and IOB
registers. The timing section of the report lists the predicted timing of
the critical path, and shows the number of levels of logic and buffers.

Summary Report

Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="ram.sum"

Cell: mem    View: behav    Library: work

Total accumulated area :

Number of FG Function Generators :     4
Number of Packed CLBs :                1
Number of IBUF :                       9
Number of OBUF :                       1
Number of IOB Output Flip Flops :      2
Number of ports :                     10
Number of nets :                      24
Number of instances :                 17
Number of references to this view :    0


Cell        Library   References   Total Area

F2_LUT      xi4ex     2 x 1        2 FG Function Generators
OFDTX       xi4ex     2 x 1        2 IOB Output Flip Flops
OBUF        xi4ex     1 x 1        1 OBUF
GND         xi4ex     1 x 1        1 GND
IBUF        xi4ex     9 x 1        9 IBUF
RAM16X1S    xi4ex     2 x 1        2 FG Function Generators


Using wire table: 4052ex-3_avg

Slack Table at End Points

End points                          Slack   Arrival          Required
                                            rise    fall     rise   fall

ix32d_15_1_10_10_0_10_10/D :        n/a     20.02   20.92    n/a    n/a
ix32d_15_0_10_10_0_10_10/D :        n/a     20.02   20.92    n/a    n/a
dio(0)/ :                           n/a     15.90   16.90    n/a    n/a
dio(1)/ :                           n/a     15.90   16.90    n/a    n/a
ix32d_15_1_10_10_0_10_10/WE :       n/a     11.90   11.80    n/a    n/a
ix32d_15_0_10_10_0_10_10/WE :       n/a     11.90   11.80    n/a    n/a
ix32d_reg_q(1)_O1/D :               n/a      9.12    9.12    n/a    n/a
ix32d_reg_q(0)_O1/D :               n/a      9.12    9.12    n/a    n/a
ro/ :                               n/a      8.55    8.55    n/a    n/a
ix32d_15_0_10_10_0_10_10/WCLK :     n/a      7.06    7.06    n/a    n/a


Critical Path Report

Critical path (unconstrained path)

NAME                          GATE        ARRIVAL      LOAD

we/                                        0.00 up     2.32
ix351/O                       IBUF         4.12 up     2.94
ix32d_nx4/O                   F2_LUT       8.86 up     2.94
ix32d_reg_q(0)_O1/O           OFDTX       16.80 dn     2.94
ix32d_reg_q(0)_I1/O           IBUF        18.60 dn     2.32
ix32d_15_0_10_10_0_10_10/D    RAM16X1S    20.92 dn     0.00

data arrival time                         20.92

data required time                        not specified

data required time                        not specified
data arrival time                         20.92

                                          unconstrained path


Critical path #1, (unconstrained path)

NAME                          GATE        ARRIVAL      LOAD

memo/                                      0.00 up     2.32
ix352/O                       IBUF         4.12 up     2.94
ix32d_nx4/O                   F2_LUT       8.86 up     2.94
ix32d_reg_q(0)_O1/O           OFDTX       16.80 dn     2.94
ix32d_reg_q(0)_I1/O           IBUF        18.60 dn     2.32
ix32d_15_0_10_10_0_10_10/D    RAM16X1S    20.92 dn     0.00

data arrival time                         20.92

data required time                        not specified

data required time                        not specified
data arrival time                         20.92

                                          unconstrained path
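
The ARRIVAL column in a critical path report is cumulative, so the delay contributed by each gate is the difference between consecutive arrival times. The Python sketch below illustrates that reading of the report; the function name and the "start" label for the path origin are hypothetical, and the arrival values are the ones listed in the critical path report above.

```python
from typing import List, Tuple


def stage_delays(path: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return (gate, incremental delay in ns) for each stage after the start point."""
    delays = []
    previous_arrival = path[0][1]
    for gate, arrival in path[1:]:
        # Each stage adds the difference between its arrival and the previous one.
        delays.append((gate, round(arrival - previous_arrival, 2)))
        previous_arrival = arrival
    return delays


# (gate, cumulative arrival time in ns) read from the report.
critical_path = [
    ("start", 0.00),
    ("IBUF", 4.12),
    ("F2_LUT", 8.86),
    ("OFDTX", 16.80),
    ("IBUF", 18.60),
    ("RAM16X1S", 20.92),
]

for gate, delay in stage_delays(critical_path):
    print(f"{gate:10s} {delay:5.2f} ns")
```

Read this way, the largest single contribution on this path comes from the OFDTX stage, and the stage delays sum to the reported data arrival time of 20.92 ns.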


Log Report

Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="ram.log"

C:\Program Files\Exemplar Logic\Galileo 4.2\bin\win32\gc.exe \
F:/rel4.2/example/ram.vhd F:/rel4.2/example/ram.edf -input_format=VHDL \
-target=xi4ex -output_format=EDIF -area -effort=quick \
-edif_timing_file=F:/rel4.2/example/ram.tim -encoding=OneHot -wire_tree=Worst \
-nocontrol -vhdl_93 -process=3 -wire_table=4052ex-3_avg -chip


Galileo - V4.2 (build 2.01, compiled Dec 19 1997 at 16:40:4&» 
Copyright 1990-1996 Exemplar Logic, Inc. All rights reserved. 

Checking Security ... 

Info: setting encoding to OneHot
Info: setting process to 3
Info: setting wire_tree to Worst
Info: setting wire_table to 4052ex-3_avg

-- Welcome to Galileo
-- Run By massouni#WACO
-- Run Started On Tue Jan 13 11:05:09 Pacific Daylight Time 1998

-- read -format VHDL (F:/rel4.2/example/ram.vhd)
-- Reading file C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\standard.vhd for
unit standard
-- Loading package standard into library std
-- Reading vhdl file F:/rel4.2/example/ram.vhd into library work
-- Reading file C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\std_1164.vhd for
unit std_logic_1164
-- Loading package std_logic_1164 into library ieee
-- Reading file C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\ex_1164.vhd for unit
exemplar_1164
-- Loading package exemplar_1164 into library exemplar
-- Reading file C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\exemplar.vhd for
unit exemplar
-- Loading package exemplar into library exemplar
-- Loading package my_pkg into library work
-- Loading entity mem into library work
-- Loading architecture behav of mem into library work
"F:/rel4.2/example/ram.vhd", line 20: Warning, output ro is never assigned
a value.
-- Compiling root entity mem(behav)
-- Reading target technology xi4ex
Reading library file 'C:\PROGRA~1\EXEMPL~1\GALILE~1.2\lib\xi4ex.syn'...
Library version = 0.9
Delays assume: Process=3
-- Pre Optimizing Design .work.mem.behav
INFO: Using Ram Cell ram_io_inclock_outclock_2_3_8.
-- Read Module Generators
-- Reading module generator description from file
C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\modgen\xi4e.vhd
-- Reading vhdl file C:\PROGRA~1\EXEMPL~1\GALILE~1.2\data\modgen\xi4e.vhd
into library OPERATORS
-- Modgen File xi4e.vhd Version 4.20
-- Resolving Modgen With modgen_select "small"
-- Start module generator resolving for design .work.mem.behav
-- Resolving function ram_io with module generator
ram_io_2_3_8_true_true_false from file xi4e.vhd
-- optimize -target xi4ex -effort quick -chip -area
-- Start optimization for design .work.mem.behav
Using wire table: 4052ex-3_avg

Pass    Area     Delay    DFFs    PIs    POs    --CPU--
        (FGs)    (ns)                           min:sec
1       2        17       2       7      3      00:00

Info, setting outputs in top level view 'behav' to fast.
Using wire table: 4052ex-3_avg
-- Start timing optimization for design .work.mem.behav

Latest arrival time at primary output: 20.9 ns
Latest arrival time at register input: 20.9 ns

forcing timing constraints at all end points: 19.8 ns

Initial Timing Optimization Statistics:

Most Critical Slack : -2.1
Sum of Negative Slacks : -4.2
Longest Path : 20.9 ns
Area : 2.0

Final Timing Optimization Statistics:

Most Critical Slack : -2.1
Sum of Negative Slacks : -4.2
Longest Path : 20.9 ns
Area : 2.0

Total time taken : 0 cpu secs
Using wire table: 4052ex-3_avg

-- Start timespec generation for design .work.mem.behav

Cell: mem    View: behav    Library: work

Number of ports :                     10
Number of nets :                      24
Number of instances :                 17
Number of references to this view :    0

Total accumulated area :

Number of FG Function Generators :     4
Number of Packed CLBs :                1
Number of IBUF :                       9
Number of OBUF :                       1
Number of IOB Output Flip Flops :      2

-- Writing file F:/rel4.2/example/ram.edf
-- CPU time taken for this run was 29.55 sec
-- Run ended On Tue Jan 13 11:05:37 Pacific Daylight Time 1998
-- Galileo run successfully completed. Goodbye!

Synopsys FPGA Express 

FPGA Express™ reports include information on the following. 

• Synthesis options used 

• Primitives used 

• Required and predicted frequency of the clocks 

• Critical paths 

Chip fd32ce-Optimized

Summary Information:

Type: Optimized implementation
Source: fd32ce, up to date
Status: 0 errors, 0 warnings, 1 messages
Export: not exported since last optimization

Target Information:

Vendor: Xilinx
Family: XC4000
Device: 4085XLBG560
Speed: xl-09

Chip Parameters:

Optimize for: Speed
Optimization effort: High
Frequency: 50 MHz
Is module: No
Keep io pads: No
Number of flip-flops: 32
Number of latches: 0

Chip Design Hierarchy:

fd32ce: defined in /home/olayda/drcust/stapp2/testl/fd32ce.v

Primitive reference count:

BUFG       1
IBUF       2
INFF      32
OBUF      32
STARTUP    1


Clocks:

                                         Required   Estimated
                Period   Rise    Fall    Freq       Freq
Signal          (ns)     (ns)    (ns)    (MHz)      (MHz)

default         20       0       10      50.00      n/a
clock_BUFGed    n/a      n/a     n/a     n/a        100.00
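
The Period and Freq columns in the Clocks table are two views of the same quantity: a frequency in MHz is 1000 divided by the period in ns. As a small illustration (the function names are hypothetical and this is not part of FPGA Express), the following Python sketch converts between the two.

```python
def period_to_mhz(period_ns: float) -> float:
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


def mhz_to_period(freq_mhz: float) -> float:
    """Convert a clock frequency in MHz to a period in nanoseconds."""
    return 1000.0 / freq_mhz


print(period_to_mhz(20.0))   # required clock: 20 ns period -> 50.0 MHz
print(mhz_to_period(100.0))  # estimated clock: 100 MHz -> 10.0 ns period
```

The 20 ns period requested for the default clock corresponds to the 50 MHz required frequency shown in the report, and the 100 MHz estimate for the buffered clock corresponds to a 10 ns period.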


Timing Groups:

Name                 Description

(I)                  Input ports
(O)                  Output ports
(RC,clock_BUFGed)    Clocked by rising edge of clock_BUFGed

Timing Path Groups:

                                          Required   Estimated
                                          Delay      Delay
From                 To                   (ns)       (ns)

(RC,clock_BUFGed)    (RC,clock_BUFGed)    20.00      8.42
                     (O)                  20.00      9.08


Input Port Timing:

               Required   Estimated
Port           Delay      Slack
Name           (ns)       (ns)        To-Group

clock          11.56      11.58       (RC,clock_BUFGed)
ce             n/a        n/a         (RC,clock_BUFGed)
reset          n/a        n/a         (RC,clock_BUFGed)
data_in<31>    12.10      12.10       (RC,clock_BUFGed)
data_in<30>    12.10      12.10       (RC,clock_BUFGed)
data_in<29>    12.10      12.10       (RC,clock_BUFGed)
data_in<28>    12.10      12.10       (RC,clock_BUFGed)
data_in<27>    12.10      12.10       (RC,clock_BUFGed)
data_in<26>    12.10      12.10       (RC,clock_BUFGed)
data_in<25>    12.10      12.10       (RC,clock_BUFGed)
data_in<24>    12.10      12.10       (RC,clock_BUFGed)
data_in<23>    12.10      12.10       (RC,clock_BUFGed)
data_in<22>    12.10      12.10       (RC,clock_BUFGed)
data_in<21>    12.10      12.10       (RC,clock_BUFGed)
data_in<20>    12.10      12.10       (RC,clock_BUFGed)
data_in<19>    12.10      12.10       (RC,clock_BUFGed)
data_in<18>    12.10      12.10       (RC,clock_BUFGed)
data_in<17>    12.10      12.10       (RC,clock_BUFGed)
data_in<16>    12.10      12.10       (RC,clock_BUFGed)
data_in<15>    12.10      12.10       (RC,clock_BUFGed)
data_in<14>    12.10      12.10       (RC,clock_BUFGed)
data_in<13>    12.10      12.10       (RC,clock_BUFGed)
data_in<12>    12.10      12.10       (RC,clock_BUFGed)
data_in<11>    12.10      12.10       (RC,clock_BUFGed)
data_in<10>    12.10      12.10       (RC,clock_BUFGed)
data_in<9>     12.10      12.10       (RC,clock_BUFGed)
data_in<8>     12.10      12.10       (RC,clock_BUFGed)
data_in<7>     12.10      12.10       (RC,clock_BUFGed)
data_in<6>     12.10      12.10       (RC,clock_BUFGed)
data_in<5>     12.10      12.10       (RC,clock_BUFGed)
data_in<4>     12.10      12.10       (RC,clock_BUFGed)
data_in<3>     12.10      12.10       (RC,clock_BUFGed)
data_in<2>     12.10      12.10       (RC,clock_BUFGed)
data_in<1>     12.10      12.10       (RC,clock_BUFGed)
data_in<0>     12.10      12.10       (RC,clock_BUFGed)

Output Port Timing:

                Required   Estimated
Port            Delay      Slack
Name            (ns)       (ns)        From-Group

data_out<31>    20.00      10.92       (RC,clock_BUFGed)
data_out<30>    20.00      10.92       (RC,clock_BUFGed)
data_out<29>    20.00      10.92       (RC,clock_BUFGed)
data_out<28>    20.00      10.92       (RC,clock_BUFGed)
data_out<27>    20.00      10.92       (RC,clock_BUFGed)
data_out<26>    20.00      10.92       (RC,clock_BUFGed)
data_out<25>    20.00      10.92       (RC,clock_BUFGed)
data_out<24>    20.00      10.92       (RC,clock_BUFGed)
data_out<23>    20.00      10.92       (RC,clock_BUFGed)
data_out<22>    20.00      10.92       (RC,clock_BUFGed)
data_out<21>    20.00      10.92       (RC,clock_BUFGed)
data_out<20>    20.00      10.92       (RC,clock_BUFGed)
data_out<19>    20.00      10.92       (RC,clock_BUFGed)
data_out<18>    20.00      10.92       (RC,clock_BUFGed)
data_out<17>    20.00      10.92       (RC,clock_BUFGed)
data_out<16>    20.00      10.92       (RC,clock_BUFGed)
data_out<15>    20.00      10.92       (RC,clock_BUFGed)
data_out<14>    20.00      10.92       (RC,clock_BUFGed)
data_out<13>    20.00      10.92       (RC,clock_BUFGed)
data_out<12>    20.00      10.92       (RC,clock_BUFGed)
data_out<11>    20.00      10.92       (RC,clock_BUFGed)
data_out<10>    20.00      10.92       (RC,clock_BUFGed)
data_out<9>     20.00      10.92       (RC,clock_BUFGed)
data_out<8>     20.00      10.92       (RC,clock_BUFGed)
data_out<7>     20.00      10.92       (RC,clock_BUFGed)
data_out<6>     20.00      10.92       (RC,clock_BUFGed)
data_out<5>     20.00      10.92       (RC,clock_BUFGed)
data_out<4>     20.00      10.92       (RC,clock_BUFGed)
data_out<3>     20.00      10.92       (RC,clock_BUFGed)
data_out<2>     20.00      10.92       (RC,clock_BUFGed)
data_out<1>     20.00      10.92       (RC,clock_BUFGed)
data_out<0>     20.00      10.92       (RC,clock_BUFGed)

Critical Path Timing:

        Arrival   Required
Cell    Time      Time       Fanout
Type    (ns)      (ns)       Count    Pin-Name

port    9.08      20.00      0        /fd32ce-Optimized/dat
OBUF    9.08      20.00      0        /fd32ce-Optimized/C17
OBUF    4.38      15.30      1        /fd32ce-Optimized/C17
INFF    1.30      12.22      1        /fd32ce-Optimized/dat
INFF    0.00      10.92      32       /fd32ce-Optimized/dat
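
The slack figures in the port timing tables follow from the arrival and required times in the Critical Path Timing table: slack is the required time minus the arrival time, and a negative result means the path misses its constraint. The Python sketch below is an illustration of that relationship only (the function name is hypothetical); it uses the two INFF rows of the report, with an arrival time of 1.30 ns against a required time of 12.22 ns, and 0.00 ns against 10.92 ns.

```python
def slack_ns(required_ns: float, arrival_ns: float) -> float:
    """Slack = required time - arrival time; negative slack means a timing violation."""
    return round(required_ns - arrival_ns, 2)


# (cell type, arrival time in ns, required time in ns) from the INFF rows.
rows = [
    ("INFF", 1.30, 12.22),
    ("INFF", 0.00, 10.92),
]

for cell, arrival, required in rows:
    print(cell, slack_ns(required, arrival))
```

Both rows give the same slack, which matches the 10.92 ns estimated slack listed for every data_out port.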
A

active_low_gsr design. 4-23
after xx ns statement. 2-2 
arithmetic functions 
gate reduction. 2-37 
ordering and grouping. 2-3 
resource sharing. 2-32 
ASIC 

comparing to FPGA. 1-3. 2-1 
asynchronous reset pin. 2-38 
asynchronous set pin. 2-38 

B 

barrel shifter design. 2-18 
bi-directional I/O. 4-71 
inferring. 4-72 
instantiating. 4-74 
using LogiBLOX. 4-76 
binary encoded state machine. 4-26 
boundary scan. 4-63 

instantiating in HDL. 4-63 
BSCAN. 4-63 
BUFGP. 4-3 
BUFGS. 4-3 
BUFT see tristate buffer 

C 

capitalization style in code. 2-4 
case statement. 2-3 

comparing to if statement. 2-55 
design example. 2-58 


syntax. 2-47 
when to use. 2-47 

CLB 

XC4000. 2-39 
clear pin. 2-38. 4-9. 4-16 
clock buffers 
inserting. 4-6 
instantiating. 4-2. 4-6 
clock enable pin. 2-38. 2-42 
combinatorial feedback loop. 2-29 
comments in code. 2-13 
compile run script. 3-4 
compiling large designs, 3-5 
compiling your design. 3-4. 3-5 
conditional expression. 2-27 
constants. 2-8 
constraint precedence. 3-11 
cost-based clean-up option. 3-23 
creating readable code. 2-10 

D 

D register. 2-29 
design. 2-7 
decoders. 4-40 

delay-based clean-up option. 3-23 
design compiling. 3-4 
design entry. 3-3 
design flow 

description. 3-1 
diagram. 3-2 

using the Design Manager. 3-2 
design hierarchy. 3-3. 4-90. 4-91
design performance. 3-18 
device downloading. 3-24 
directory tree structure. 1-10 
disk space requirements. 1-6 
downloading files. 1-7 
downloading to the device, 3-24 

E 

EDIF file. 3-5

else statement. 2-29 

entering your design. 3-3 

enumerated type encoded state machine. 4-32

Exemplar Logic report files. B-4 
extracting downloaded files. 1-9 

F 

Field Programmable Gate Array see FPGA 
file transfer protocol. 1-8 
Finite State Machine. 4-32. 4-39 
changing encoding style. 4-39 
extraction commands. 4-32 
flip-flop. 2-30 
formatting styles. 2-4 
FPGA 

comparing to ASIC. 1-3. 2-1 
creating with HDLs. 4-1 
global clock buffer. 4-2 
system features, 1-4. 4-1 
FPGA Express report files. B-10 
from-to style timing constraint. 3-8
FSM see Finite State Machine 
functional simulation. 1-2. 3-4. 5-2. 5-4 
comparing to synthesis. 2-1 

G 

gate reduction 
definition. 2-37 
gated clocks. 2-42 


global clock buffer. 4-2 
global longlines. 4-5 
global set/reset. 4-9. 5-23. 5-48 
increasing performance. 4-10 
STARTUP block. 4-9 
test bench. 5-49 
global signals. 5-18. 5-22 
GSR see global set/reset 
GSRIN. 4-9
GTS. 5-35. 5-64 
guide option. 3-23 

H 

hardware description language see HDL 
HDL 

also see Verilog 
also see VHDL
coding for FPGAs. 4-1 
converting to gates. 1-2 
definition. 1-1 
designing FPGAs. 1-2. 1-3 
FPGA system features. 4-1 
boundary scan. 4-63 
global clock buffer. 4-2 
global set/reset. 4-9 
I/O decoders. 4-40 
implementing logic with IOBs. 4-67

on-chip RAM. 4-52 
implementing registers. 2-27 
schematic entry design hints. 2-17 
hdl_resource_allocation command. 2-34
hdlin_check_no_latch command. 2-30
hierarchy in designs. 1-4 
high-density design flow. 3-1 
hold-time requirement. 2-29. 4-67 

I 

I/O decoder. 4-40 
if statement. 2-30 
comparing to case statement. 2-55
design example. 2-56 
registers. 2-30 
syntax. 2-45 
when to use. 2-46 
if-case statement 

design example. 2-51
if-else statement. 2-3. 2-45
ignore timing paths. 3-9 
indenting HDL code. 2-10 
INIT=S attribute. 4-10. 4-16. 4-40
initialization statement. 2-4 
Insert Pads command. 4-2 
installation 

design examples. 1-5 
directory tree structure. 1-10 
disk space requirements. 1-6 
downloading files. 1-7 
extracting downloaded files. 1-9 
file transfer protocol. 1-8 
internet site. 1-7. 1-8 
memory requirements. 1-5 
tactical software. 1-5 
internet site. 1-7 
IOB 

implementing logic. 4-67 
moving registers. 4-79. 4-80 
unbonded. 4-81 

J 

JTAG 1149.1. 4-63 

L 

labeling in code. 2-7 
latch 

combinatorial feedback loop. 2-29 
comparing speed and area. 2-31 
converting to register. 2-29 
D flip-flop. 2-30 

D latch implemented with gates. 2-28 
hdlin_check_no_latch command. 2-30
implementing in HDL. 2-27
inference. 2-46 
latch count. 2-30 
RAM primitives. 2-30 
libraries. 5-11 
LogiBLOX 

bi-directional I/O. 4-76 
implementing memory. 4-59 
instantiating modules. 4-45 
libraries. 5-11. 5-16 
LogiCORE 

library. 5-17 

M 

mapping your design 

using design manager. 3-13 
using the command line. 3-15 
maxskew. 3-10 
memory 

implementing in HDL. 4-52 
requirements. 1-5 
Modelsim simulator. 5-48 
multi-pass place and route option. 3-21
multiplexer 

comparing gates and tristate buffer. 4-89

implementing with gates. 4-86 
implementing with tristate buffer. 4-84 
resource sharing. 2-32 

N 

named association. 2-9 
naming conventions. 2-5. 2-6 
nested if statement. 2-48 
no_gsr design. 4-11
NODELAY attribute. 4-67 

O

offset constraint. 3-9 
OMUX. 4-69 

one-hot encoded state machine. 4-35 
oscillators. 5-44
output multiplexer. 4-69 

P 

pad location. 4-79 
period constraint. 3-8 
pipelining. 4-89 

placing and routing your design. 3-19 

port declarations. 2-14 

positional association. 2-9 

post-route full timing simulation. 5-9 

post-synthesis simulation. 5-5 

-pr option. 4-80 

preset pin. 2-38. 4-9. 4-10. 4-16 

priority-encoded logic. 2-48. 2-55 

PROM file, 3-24 

pull-downs. 4-69 

pull-ups. 4-69 

R 

RAMs 

inferring. 4-58 
instantiating. 4-56 
re-entrant routing option. 3-22 
register 

clear pin. 2-38 

converting latch to register. 2-29 
D register. 2-29 
if statement. 2-30 
implementing in HDL. 2-27 
inference. 2-39 
moving into IOB. 4-79
preset pin. 2-38 
report files. B-1
report_timing command. 3-13
reset on configuration. 5-24 
reset on configuration buffer. 5-29 
resource sharing 
CLB count. 2-37 
definition. 2-31 
delay. 2-37 


design examples. 2-32 
disabling. 2-34 
ROC. 5-24 
ROCBUF, 5-29 
ROMs. 4-53 

RTL simulation. 4-11. 4-17. 5-1. 5-5 
definition. 1-2 

S

schematic entry design hints. 2-17 
set_dont_touch attribute. 4-63
signal skew. 3-10 
signals. 2-15 
SimPrim libraries. 5-11 
simulating your design, 5-1 
simulation 

creating a test bench. 5-7 
functional. 5-4 
global signals. 5-18 
industry standards. 5-11 
library source files. 5-13 
post-map. 5-6 
post-NGDBuild. 5-6 
post-synthesis, 5-5 
timing. 5-8 

simulation diagram. 5-2 
slew rate. 4-68 
software requirements. 1-5 
Spartan 

IOB. 4-67

STARTBUF. 4-9. 5-32 
STARTUP block. 4-9. 4-17 
startup state. 4-10 
state machine. 4-26 

binary encoded. 4-26 
bubble diagram. 4-27 
encoding style summary. 4-38 
enumerated type encoded. 4-32 
design example, 4-33, 4-35
enumeration type. 4-38 
initializing, 4-40 
limiting fan-in. A-3
one-hot encoded. 4-35. A-1
one-state vs. binary encoded. A-8
resetting state bits. A-11
seven states. A-3
std_logic data type. 2-13
Synopsys

creating compile run script. 3-4
Synplicity report files. B-1
synthesis 

comparing to simulation. 2-1 

T 

TCK pin. 4-63 
TDI pin. 4-63
TDO pin. 4-63 
test bench. 5-7 
TIG. 3-9 
TIMEGRPs. 3-6
timing 

constraint precedence. 3-11 
constraint priority. 3-10 
constraints. 1-5. 3-5. 3-8 
requirements. 1-5 
simulation. 3-23. 5-2. 5-8 
simulation netlist 

Command Line. 5-10 
Design Manager. 5-9 
TMS pin. 4-63 
TNMs. 3-6 
TOC. 5-36 
TOCBUF. 5-41 
TPSYNC keyword. 3-7 
tristate buffer 

comparing to gates. 4-89 
implementing multiplexer. 4-84 
tristate enable. 5-35 
tristate on configuration. 5-36 
tristate on configuration buffer. 5-41 
turns engine option. 3-21 


U

UCF. 4-79 

unbonded IOBs. 4-81
UniSim libraries. 5-11. 5-13 
use_gsr design. 4-17
user constraints file. 4-79 

V 

variables. 2-15 
Verilog 

capitalization style. 2-4 
constants for opcode. 2-9 
definition. 1-3 
global set/reset. 5-48 
GTS. 5-64 
libraries. 3-47 

parameters for constants. 2-9 
register inference. 2-41 
VHDL 

after xx ns statement. 2-2 
also see HDL 
arithmetic functions. 2-3 
capitalization style. 2-5 
case statement. 2-3 
coding styles. 2-4 
constants. 2-8 
constants for opcode. 2-8 
definition. 1-3 
if-else statement. 2-3
initialization statement. 2-4 
naming identifier. 2-6 
naming packages. 2-6 
naming types. 2-6 
register inference. 2-39 
simulation. 2-1 
std_logic data type. 2-13
synthesis. 2-1 
variables for constants. 2-8 
wait for xx ns statement. 2-2 
Xilinx naming conventions. 2-5 
VHSIC Hardware Description Language
see VHDL

W

wait for xx ns statement. 2-2

X 

XC4000 

CLB. 2-39 
IOB. 4-67 
XC5200 

IOB. 4-71 

XDW libraries. 5-11 
Xilinx internet site. 1-8 
XNF file. 3-5 
xor_sig design. 2-16
xor_var design. 2-17